Abstract

Random events in space and time often exhibit a locally dependent structure. When 
the events are very rare and dependent structure is not too complicated, various studies 
in the literature have shown that Poisson and compound Poisson processes can provide 
adequate approximations. However, the accuracy of approximations does not improve 
or may even deteriorate when the mean number of events increases. In this paper, we 
investigate an alternative family of approximating point processes and establish Stein's 
method for their approximations. We prove two theorems to accommodate respectively 
the positively and negatively related dependent structures. Three examples are given 
to illustrate that our approach can circumvent the technical difficulties encountered 
in compound Poisson process approximation [see Barbour & Månsson (2002)] and our
approximation error bound decreases when the mean number of the random events
increases, in contrast to increasing bounds for compound Poisson process approximation.
1 Introduction

Random events in space and time often exhibit a locally dependent structure. When the 
events are very rare and the dependent structure is not too complicated, a natural approach 
is to declump the events into clusters, then approximate the positions of the clusters by a
suitable Poisson process and the sizes of the clusters by independent and identically
distributed random elements, as well documented in Aldous (1989). Consequently, compound
Poisson and marked Poisson processes are widely accepted as the 'best approximate
models' for clustered rare events.

The first attempt to estimate the errors of Poisson process approximation seems to go
back to Brown (1983) with errors measured in the total variation distance, while the errors
in the Lévy-Prohorov distance were not studied until Jacod & Mano (1988) and Nikunen &
Valkeila (1991) [see also Xia (1993)]. All these studies are based on the stochastic calculus
approach with a filtration, a compensator and coupling techniques as the tools to quantify
the distances. Barbour and Brown (1992), clearly inspired by the success of Stein's method
in multivariate Poisson approximation [Barbour (1988)], laid down a general framework for
using Stein's method to estimate the Poisson process approximation errors. Their framework
can be well adjusted for errors expressed in terms of Janossy densities, Palm distributions
and compensators [see Barbour, Brown & Xia (1998) and Xia (2005)]. In terms of compound
Poisson process approximation, there seems to have been no major advance until Arratia,
Goldstein & Gordon (1989), who replaced the original point process with a new one carrying
the information of locations and cluster sizes separately so that the Stein-Chen method for
Poisson approximation can be employed to obtain useful error bounds. There are enormous
advantages to this approach if one can successfully declump the point process, but the procedure
of declumping is far from obvious in applications. By contrast, Barbour & Månsson (2002)
avoided declumping totally by setting up a framework of Stein's method so that the quality of
approximation can be studied directly, and the authors summarized that the direct approach
'has conceptual advantages, but entails technical difficulties' (p. 1492). One of the main
difficulties is that Stein's factors, like their counterparts for compound Poisson random variable
approximation [see Barbour, Chen & Loh (1992), Barbour & Utev (1998) and Barbour &
Utev (1999)], are generally too crude to use unless more conditions are imposed, such as that the
compound Poisson process is very close to a Poisson process. An immediate consequence is
that the error bounds obtained often deteriorate when the mean of the point process
increases, i.e., when more information is available. On the other hand, using the improved estimates
for Stein's factors for Poisson process approximation in Xia (2005) [cf. Brown, Weinberg &
Xia (2000)], Chen & Xia (2004) managed to produce error estimates for Poisson process
approximation to short range dependent rare events, and the estimates remain small
(but do not improve either) when the average number of events increases.

It is well-known that the central limit theorem often exhibits the large sample property,
i.e. the larger the sample size, the better the approximation, as evidenced by the Berry-Esseen
bound [see Chen and Shao (2004)]. If we are interested in the total counts of rare
and weakly dependent events, the Poisson law of small numbers is the cornerstone of the
area. However, the Poisson approximation error does not enjoy the large sample property
when more rare events are counted [Barbour & Hall (1984)]. The shortcoming is due to
the fact that a Poisson distribution has only one parameter to adjust while a normal
distribution has two. When more parameters are introduced, this property can be
recovered [see Presman (1983), Kruopis (1986), Čekanavičius (1997), Barbour & Xia (1999),
Brown & Xia (2001), Röllin (2005)]. In fact, Brown & Xia (2001) discovered a large family
of distributions that can achieve the same purpose.
The success of compound Poisson process approximation essentially hinges on the fact
that the events are very rare. It is tempting to ask whether the approximation theory is
still valid when the events are less rare and more heavily dependent, and the mean number of
events increases. One way to tackle this problem is to keep the approximating process as
a Poisson process but weaken the metric for quantifying the difference between point
processes [Schuhmacher & Xia (2008)]. The weaker metric will naturally limit its applicability.
The second approach is to introduce more parameters into the approximating point process
models. To put the idea into practice, Xia & Zhang (2008) introduced a family of point process
counterparts of the approximating distributions suggested in Brown & Xia (2001), and named
them the polynomial birth-death point processes, or PBDP for short. In particular, Xia
& Zhang (2008) bounded the distance between the Bernoulli process with a constant
success probability and a suitable PBDP in terms of the Barbour-Brown distance (defined in
Section 2 below, see also Barbour & Brown (1992)). The assumption of the constant success
probability plays a crucial role there because the symmetric structure enables the authors
to construct a suitable coupling to directly compare the two distributions. The pilot study
shows that, for the Bernoulli process with the same success probability, it is possible to
recover the large sample property for PBDP approximation. The purpose of this paper is to
demonstrate that the large sample property prevails among a large group of point processes
when these PBDP are used as approximating models. To this end, we set up the Stein
equation of PBDP approximation and establish its Stein factors so that one can directly estimate
the difference between the distribution of a general point process and that of a PBDP.

Our paper is arranged as follows. In Section 2 we briefly review the polynomial birth-death
point processes introduced in Xia & Zhang (2008), lay down a foundation of Stein's
method for their approximation and conclude the section with estimates of Stein's factors
in terms of the Barbour-Brown metric. To make our paper reader-friendly, we postpone the
technical proofs of Stein's factors to Section 5. Section 3 is devoted to point processes with
locally dependent structures which are analogous to those in Chen & Shao (2004). We
state two theorems for error estimates of PBDP approximations, respectively for positively
and negatively related dependence. The proofs of these theorems are rather complicated so
we leave them to the last two sections (Sections 6 and 7) of the paper. Examples are provided
in Section 4 to illustrate the key steps of applying the main theorems.



2 Stein's method for polynomial birth-death point processes

The family of approximating distributions in Brown & Xia (2001) was introduced through
the invariant distributions of birth-death processes. For ease of use, they focused on birth
and death rates that are polynomial functions of the states of the process, and consequently
called the invariant distribution a polynomial birth-death distribution. More precisely, let

$$\alpha_k = a + bk,\ \forall k \ge 0; \qquad \beta_k = k + \beta k(k-1),\ \forall k \ge 1, \tag{2.1}$$

where $a > 0$, $0 \le b < 1$, $\beta \ge 0$. A birth-death process with birth rates $\{\alpha_k\}$ and death rates
$\{\beta_k\}$ must be ergodic. As in Brown & Xia (2001), we let $Z_n(\cdot) := \{Z_n(t) : t \ge 0\}$ be such a
process with initial value $n$ and use $\pi_{a,b;\beta}$, or simply $\pi$ when there is no confusion, to stand
for the invariant distribution.
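As a numerical illustration, the invariant distribution of a birth-death chain can be computed from detailed balance, $\pi_{k+1}\beta_{k+1} = \pi_k\alpha_k$. The following is a minimal sketch under that standard fact; the function names `pbd_pmf` and `mean_var`, the truncation tolerance `tol` and the cap `max_states` are illustrative assumptions, not part of the paper.

```python
def pbd_pmf(a, b, beta, tol=1e-14, max_states=100_000):
    """Approximate invariant distribution of the birth-death chain with
    birth rates a + b*k and death rates k + beta*k*(k-1), truncated once
    the unnormalised weights become negligible (tol is an assumed cutoff)."""
    if not (a > 0 and 0 <= b < 1 and beta >= 0):
        raise ValueError("need a > 0, 0 <= b < 1, beta >= 0")
    weights = [1.0]          # unnormalised pi_0
    total = 1.0
    for k in range(max_states):
        birth = a + b * k                    # alpha_k
        death = (k + 1) * (1 + beta * k)     # beta_{k+1} = (k+1) + beta*(k+1)*k
        w = weights[-1] * birth / death      # detailed balance
        weights.append(w)
        total += w
        # once birth < death the weights decrease for all larger k
        if birth < death and w < tol * total:
            break
    return [w / total for w in weights]


def mean_var(pmf):
    """Mean and variance of a probability mass function on 0, 1, 2, ..."""
    mean = sum(k * p for k, p in enumerate(pmf))
    second = sum(k * k * p for k, p in enumerate(pmf))
    return mean, second - mean * mean
```

With $b = \beta = 0$ the weights are $a^k/k!$, so the output is a truncated Poisson($a$) distribution; with $\beta = 0$ and $b > 0$ it is negative binomial.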

Let $\Gamma$ be a compact metric space with metric $d_0$ bounded by 1 and Borel $\sigma$-algebra $\mathcal{B}(\Gamma)$
generated by $d_0$. Set $U, U_1, U_2, \dots$ as independent and identically distributed $\Gamma$-valued
random elements with distribution $\mu$. In this paper, the expression $\sum_{i=1}^{X}\delta_{U_i}$ always implies that
the nonnegative integer random variable $X$ is independent of $\{U_i : i \ge 1\}$. We call $\mathbf{Z}$ a
polynomial birth-death point process [see Xia & Zhang (2008)] if it can be expressed as

$$\mathbf{Z} = \sum_{i=1}^{Z}\delta_{U_i}$$

for $Z \sim \pi_{a,b;\beta}$, and denote $\mathcal{L}(\mathbf{Z})$ by $\boldsymbol{\pi}_{a,b;\beta;\mu}$, or simply $\boldsymbol{\pi}$ when there is no confusion. We now
give a few examples to illustrate that the definition is a natural extension of the polynomial
birth-death distribution.

Example 1 Suppose $Z$ follows Binomial$(n, p)$, then $\mathbf{Z}$ reduces to a binomial process.

Example 2 If $Z$ is a Poisson random variable with mean $a$, then $\mathbf{Z}$ becomes a Poisson
process on $\Gamma$ with mean measure $a\mu$.

Example 3 When $Z$ has a negative binomial distribution, we call $\mathbf{Z}$ a negative binomial
process.

Remark 2.1 There are two possible ways to define a negative binomial process. The one 
we defined here does not have the property of independent increments while if we define it as 
a compound Poisson process with clusters following a logarithmic distribution, then it does 
have the property of independent increments. Nevertheless, the two distributions converge
when the intensity of the Poisson component becomes large [see Remark 4.7 below].

Now we construct a Markov process with invariant distribution $\boldsymbol{\pi} = \boldsymbol{\pi}_{a,b;\beta;\mu}$. Allowing
repeats of points, each finite integer-valued measure on $\Gamma$ can be written as $\xi = \sum_{i=1}^{n}\delta_{x_i}$. Since
the points are not necessarily distinct, we introduce the notation $\lfloor x_1, \dots, x_n \rfloor$ to
stand for the collection of the $n$ points. In this paper, we do not distinguish $\sum_{i=1}^{n}\delta_{x_i}$ from the
collection $\lfloor x_1, \dots, x_n \rfloor$, or from a configuration with $n$ particles respectively located at $x_1, \dots, x_n$.
For example, when we say a site/point $x$ or a particle at $x$ in $\xi$, it means that $\xi(\{x\}) \ge 1$.

For each measure $\xi$ on $\Gamma$, we denote its total mass by $|\xi|$. Let $\mathcal{H}$ be the class of all possible
finite integer-valued measures (also known as the configurations of point processes) on $\Gamma$ and
let $\mathcal{B}(\mathcal{H})$ be the smallest $\sigma$-algebra in $\mathcal{H}$ making the mappings $\xi \mapsto \xi(C)$ measurable for
all relatively compact Borel sets $C \subset \Gamma$. For each suitable measurable function $h$ on $\mathcal{H}$, we
define

$$\mathcal{A}h(\xi) := (a + b|\xi|)\left(\mathbb{E}h(\xi + \delta_U) - h(\xi)\right) + |\xi|\left(1 + \beta(|\xi| - 1)\right)\left(\mathbb{E}h(\xi - \delta_{V(\xi)}) - h(\xi)\right), \tag{2.2}$$

where, for $\xi = \sum_{i=1}^{n}\delta_{x_i}$, $V(\xi)$ is a uniformly distributed random element on the collection
of the points $x_1, \dots, x_n$. In other words, $V(\xi)$ is equally likely to be one of $x_1, \dots, x_n$. A particle system
$\mathbf{Z}_\xi(\cdot) := \{\mathbf{Z}_\xi(t) : t \ge 0\}$ with the generator $\mathcal{A}$ evolves as follows:

• with rate $a$, a new particle immigrates to $\Gamma$ and settles at a site chosen according to $\mu$;

• with rate $b$, an existing particle gives birth, and the newborn particle is also located
at a site chosen according to $\mu$;

• with rate 1, an existing particle dies of its own accord;

• with rate $\beta$, an existing particle kills another existing particle.

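We call such a Markov process a birth-death system. It is not difficult to check that the
birth-death system has the unique invariant distribution $\boldsymbol{\pi}_{a,b;\beta;\mu}$. Noting that for any $\xi \in \mathcal{H}$,
$\{|\mathbf{Z}_\xi(t)| : t \ge 0\}$ is a birth-death process with rates (2.1), we have $\mathcal{L}(|\mathbf{Z}_\xi(\cdot)|) = \mathcal{L}(Z_{|\xi|}(\cdot))$.
Therefore, $\mathcal{L}(\mathbf{Z}_\xi(t)) = \mathcal{L}\left(\sum_{i=1}^{Z_{|\xi|}(t)}\delta_{U_i}\right)$ if $\mathcal{L}(\xi) = \mathcal{L}\left(\sum_{i=1}^{|\xi|}\delta_{U_i}\right)$. In particular, we have
$\mathcal{L}(\mathbf{Z}_0(t)) = \mathcal{L}\left(\sum_{i=1}^{Z_0(t)}\delta_{U_i}\right)$.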
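The dynamics of the birth-death system can be sketched with a standard Gillespie-type simulation. The sketch below assumes, purely for illustration, that $\Gamma = [0, 1]$ and that $\mu$ is the uniform distribution; the function name and the returned time average are likewise illustrative choices. The total birth rate in a configuration with $n$ particles is $a + bn$ and the total death rate is $n(1 + \beta(n-1))$, with the dying particle chosen uniformly.

```python
import random


def simulate_birth_death_system(a, b, beta, t_end, seed=0):
    """Simulate the particle system on [0, 1] (mu = uniform is an assumption)
    up to time t_end; return the final configuration and the time-averaged
    number of particles."""
    rng = random.Random(seed)
    points = []      # current configuration, one entry per particle
    t = 0.0
    area = 0.0       # integral of the particle count over time
    while True:
        n = len(points)
        birth = a + b * n                   # immigration plus births
        death = n * (1 + beta * (n - 1))    # natural deaths plus killings
        total = birth + death
        hold = rng.expovariate(total)       # exponential holding time
        if t + hold >= t_end:
            area += n * (t_end - t)
            break
        area += n * hold
        t += hold
        if rng.random() * total < birth:
            points.append(rng.random())         # new particle located according to mu
        else:
            points.pop(rng.randrange(n))        # a uniformly chosen particle is removed
    return points, area / t_end
```

For $b = \beta = 0$ the invariant count distribution is Poisson with mean $a$, so over a long horizon the time-averaged count should be close to $a$.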

Bearing in mind the Stein equation suggested by Barbour & Brown (1992), the natural
choice of the Stein equation for the generator $\mathcal{A}$ is

$$\mathcal{A}h(\xi) = f(\xi) - \boldsymbol{\pi}(f) \tag{2.3}$$

for suitable functions $f$ on $\mathcal{H}$, where $\boldsymbol{\pi}(f) := \int f(\xi)\boldsymbol{\pi}(d\xi)$. We now consider the question
of the existence of an $h$ that solves the equation (2.3).

Proposition 2.2 For any bounded function $f$ on $\mathcal{H}$,

$$h_f(\xi) := -\int_0^\infty \left(\mathbb{E}f(\mathbf{Z}_\xi(t)) - \boldsymbol{\pi}(f)\right)dt$$

is well defined, and is a solution of (2.3).



Proof. Let $\{U_i\}$ be independent $\mu$-distributed random elements which are independent of
$\{\mathbf{Z}_\xi(t) : t \ge 0\}$. Pair $\{U_i, 1 \le i \le |\xi|\}$ with the points in $\xi$, define $\xi' = \sum_{i=1}^{|\xi|}\delta_{U_i}$, and
construct $\{\mathbf{Z}_{\xi'}(t) : t \ge 0\}$ from $\{\mathbf{Z}_\xi(t) : t \ge 0\}$ by replacing the points in $\xi$ with the paired
counterparts in $\xi'$. Let $\tau$ be the last death time of all the points in $\xi$. We have

$$\int_0^\infty \left|\mathbb{E}f(\mathbf{Z}_\xi(t)) - \mathbb{E}f(\mathbf{Z}_{\xi'}(t))\right|dt \le \int_0^\infty \mathbb{E}\left(2\|f\|\mathbf{1}_{\tau > t}\right)dt = 2\|f\|\,\mathbb{E}\tau < \infty,$$

since $\tau$ is stochastically smaller than the maximum of $|\xi|$ independent and identically
distributed exp(1) random variables.

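Next, define $\tilde f(n) = \mathbb{E}f\left(\sum_{i=1}^{n}\delta_{U_i}\right)$ for all $n \ge 0$, then

$$\int_0^\infty \left|\mathbb{E}f(\mathbf{Z}_{\xi'}(t)) - \boldsymbol{\pi}(f)\right|dt \le \int_0^\infty \left|\mathbb{E}\tilde f(Z_{|\xi|}(t)) - \pi(\tilde f)\right|dt < \infty$$

due to the positive recurrence of the Markov chain $\{Z_{|\xi|}(t), t \ge 0\}$. Hence,

$$\int_0^\infty \left|\mathbb{E}f(\mathbf{Z}_\xi(t)) - \boldsymbol{\pi}(f)\right|dt \le \int_0^\infty \left|\mathbb{E}f(\mathbf{Z}_\xi(t)) - \mathbb{E}f(\mathbf{Z}_{\xi'}(t))\right|dt + \int_0^\infty \left|\mathbb{E}f(\mathbf{Z}_{\xi'}(t)) - \boldsymbol{\pi}(f)\right|dt < \infty,$$

which implies that $h_f$ is well-defined.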

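To establish (2.3), let $\tau_\xi = \inf\{t : \mathbf{Z}_\xi(t) \ne \xi\}$, which has an exponential distribution with
parameter $\alpha_{|\xi|} + \beta_{|\xi|}$. Then

$$\begin{aligned}
h_f(\xi) &= -\int_0^\infty \left(\mathbb{E}f(\mathbf{Z}_\xi(t)) - \boldsymbol{\pi}(f)\right)dt \\
&= -\left(f(\xi) - \boldsymbol{\pi}(f)\right)\mathbb{E}\tau_\xi - \mathbb{E}\int_{\tau_\xi}^\infty \left(f(\mathbf{Z}_\xi(t)) - \boldsymbol{\pi}(f)\right)dt \\
&= -\frac{f(\xi) - \boldsymbol{\pi}(f)}{\alpha_{|\xi|} + \beta_{|\xi|}} + \mathbb{E}h_f(\mathbf{Z}_\xi(\tau_\xi)) \\
&= -\frac{f(\xi) - \boldsymbol{\pi}(f)}{\alpha_{|\xi|} + \beta_{|\xi|}} + \frac{\alpha_{|\xi|}\int_\Gamma h_f(\xi + \delta_x)\mu(dx) + (1 + \beta(|\xi| - 1))\int_\Gamma h_f(\xi - \delta_x)\xi(dx)}{\alpha_{|\xi|} + \beta_{|\xi|}},
\end{aligned}$$

and (2.3) follows by rearranging the above equation. □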



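The metric used for quantifying the differences of two point processes is defined as follows
[see Barbour & Brown (1992)]. Let $\mathcal{K}$ be the class of $d_0$-Lipschitz functions $u$ on $\Gamma$ such
that $|u(x) - u(y)| \le d_0(x, y)$ for all $x, y \in \Gamma$. For any two measures $\rho_1$ and $\rho_2$ on $\Gamma$, define

$$d_1(\rho_1, \rho_2) = \begin{cases}
0, & \text{if } |\rho_1| = |\rho_2| = 0, \\
\dfrac{1}{|\rho_1|}\sup_{u \in \mathcal{K}}\left|\int u\,d\rho_1 - \int u\,d\rho_2\right|, & \text{if } |\rho_1| = |\rho_2| \ne 0, \\
1, & \text{if } |\rho_1| \ne |\rho_2|.
\end{cases}$$

For any configurations $\xi = \sum_{i=1}^{n}\delta_{x_i}$ and $\eta = \sum_{i=1}^{n}\delta_{y_i}$ with $n \ge 1$, $d_1(\xi, \eta)$ can be
represented as

$$d_1(\xi, \eta) = \min_{\sigma}\frac{1}{n}\sum_{i=1}^{n} d_0(x_i, y_{\sigma(i)}),$$

where the minimum is taken over all permutations $\sigma$ of $(1, \dots, n)$. The Barbour-Brown metric
$d_2$ between point process distributions is defined as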

$$d_2(P, Q) := \sup_{f \in \mathcal{F}}|P(f) - Q(f)| = \inf_{\xi \sim P,\ \eta \sim Q}\mathbb{E}d_1(\xi, \eta),$$

where the supremum is taken over all functions in

$$\mathcal{F} := \{f : |f(\xi) - f(\eta)| \le d_1(\xi, \eta),\ \forall \xi, \eta \in \mathcal{H}\},$$

and the last equation is due to the duality theorem [see Rachev (1991), p. 168]. The metric $d_2$
is a particular kind of the well-known family of Wasserstein metrics. It is worthwhile to point
out that, since $d_1 \le 1$, all functions in $\mathcal{F}$ are bounded and Proposition 2.2 ensures the
existence of solutions of Stein's equation (2.3) for these functions. Historically, the Wasserstein
metrics were motivated by the classical Monge transportation problem. In our context, we
will handle the 'transportation problem' in two steps, i.e. to form 'sandpiles' by assembling
local points to designated centers and then transport the 'sandpiles' of the point process
being approximated to the corresponding 'sandpiles' of the PBDP.
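For small configurations, the distance $d_1$ between two configurations can be evaluated directly from its permutation representation (the minimum over permutations $\sigma$ of the average of $d_0(x_i, y_{\sigma(i)})$, with value 1 when the numbers of points differ). The following is a brute-force sketch; taking $\Gamma \subset \mathbb{R}$ with $d_0(x, y) = |x - y| \wedge 1$ is an assumption made only for the illustration, and the factorial cost makes it unsuitable for large $n$.

```python
from itertools import permutations


def d0(x, y):
    """Assumed ground metric on the real line, truncated so that d0 <= 1."""
    return min(abs(x - y), 1.0)


def d1(xs, ys):
    """Distance d1 between two configurations given as lists of points."""
    if len(xs) != len(ys):
        return 1.0            # different total masses
    n = len(xs)
    if n == 0:
        return 0.0            # both configurations are empty
    best = min(
        sum(d0(x, ys[j]) for x, j in zip(xs, perm))
        for perm in permutations(range(n))
    )
    return best / n
```

For larger configurations one would replace the enumeration by an assignment-problem solver; that substitution is not discussed in the paper.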

The following Lemma is often useful for comparing two different approximating polynomial 
birth-death point processes. 

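Lemma 2.3 We have

$$d_2\left(\boldsymbol{\pi}_{a_1,b_1;\beta_1;\mu_1}, \boldsymbol{\pi}_{a_2,b_2;\beta_2;\mu_2}\right) \le d_{tv}\left(\pi_{a_1,b_1;\beta_1}, \pi_{a_2,b_2;\beta_2}\right) + d_1(\mu_1, \mu_2),$$

where for two probability measures $Q_1$ and $Q_2$ on $\mathbb{Z}_+ := \{0, 1, 2, \dots\}$,

$$d_{tv}(Q_1, Q_2) := \sup_{A \subset \mathbb{Z}_+}|Q_1(A) - Q_2(A)|.$$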



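Proof. Using the Kantorovich-Rubinstein duality theorem [Rachev (1991), Theorem 8.1.1,
p. 168], we can couple together $Z_1 \sim \pi_{a_1,b_1;\beta_1}$, $Z_2 \sim \pi_{a_2,b_2;\beta_2}$, and sequences of $\Gamma$-valued
random elements $\tau_{1i} \sim \mu_1$ and $\tau_{2i} \sim \mu_2$, $i \ge 1$, such that

$$d_{tv}\left(\pi_{a_1,b_1;\beta_1}, \pi_{a_2,b_2;\beta_2}\right) = \mathbb{P}(Z_1 \ne Z_2),$$
$$\mathbb{E}d_0(\tau_{1i}, \tau_{2i}) = d_1(\mu_1, \mu_2) \text{ for all } i \ge 1,$$

and $\{(\tau_{1i}, \tau_{2i}), i \ge 1\}$ are independent and independent of $(Z_1, Z_2)$. Then

$$\begin{aligned}
d_2\left(\boldsymbol{\pi}_{a_1,b_1;\beta_1;\mu_1}, \boldsymbol{\pi}_{a_2,b_2;\beta_2;\mu_2}\right)
&\le \mathbb{E}d_1\left(\sum_{i=1}^{Z_1}\delta_{\tau_{1i}}, \sum_{i=1}^{Z_2}\delta_{\tau_{2i}}\right) \\
&\le \mathbb{P}(Z_1 \ne Z_2) + \mathbb{E}\left[d_1\left(\sum_{i=1}^{Z_1}\delta_{\tau_{1i}}, \sum_{i=1}^{Z_2}\delta_{\tau_{2i}}\right)\,\middle|\, Z_1 = Z_2\right]\mathbb{P}(Z_1 = Z_2) \\
&\le d_{tv}\left(\pi_{a_1,b_1;\beta_1}, \pi_{a_2,b_2;\beta_2}\right) + \mathbb{E}\left[\frac{1}{Z_1}\sum_{i=1}^{Z_1}d_0(\tau_{1i}, \tau_{2i})\,\middle|\, Z_1 = Z_2\right]\mathbb{P}(Z_1 = Z_2) \\
&\le d_{tv}\left(\pi_{a_1,b_1;\beta_1}, \pi_{a_2,b_2;\beta_2}\right) + d_1(\mu_1, \mu_2),
\end{aligned} \tag{2.4}$$

completing the proof. □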

In applications of Stein's equation, one will encounter the following quantities:

$$C_n := \sup\{|h_f(\xi + \delta_x) - h_f(\xi + \delta_y)| : f \in \mathcal{F},\ \xi \in \mathcal{H},\ |\xi| = n,\ x, y \in \Gamma\},$$

with $C_{-1} := 0$,

$$\Delta_2 h(\xi; x, y) := h(\xi + \delta_x + \delta_y) - h(\xi + \delta_x) - h(\xi + \delta_y) + h(\xi), \quad \xi \in \mathcal{H},\ x, y \in \Gamma,$$

and

$$\Delta_2 h(\xi) := \sup\{|\Delta_2 h(\xi; x, y)| : x, y \in \Gamma\}.$$

The following estimates, often known as Stein's factors, are usually needed in applying
Stein's method. In fact, the success of Stein's method is centered around the quality of
these estimates.
Theorem 2.4 (i) For $n \ge 0$,



Cn<mm{l,—^ r + -,7 7T7 r • (2.5) 

^ '2(n + l) a'(aA6)(n + l)' ^ ^ 



(ii) For any $f \in \mathcal{F}$, $\xi \in \mathcal{H}$,



A.M«)<^ + J. (2.6) 



Remark 2.5 The estimates in Theorem 2.4 are of the correct order. In fact, if we take
$\beta = b = 0$, the PBDP becomes a Poisson process and the estimates for the Poisson process
are known to be of the correct order [see Xia (2005)].



3 Locally dependent point processes 



A point process $\Xi$ on $\Gamma$ is defined as a measurable mapping of some fixed probability space
into $(\mathcal{H}, \mathcal{B}(\mathcal{H}))$ and $\lambda(dx) = \mathbb{E}\Xi(dx)$ is said to be the intensity or mean measure of $\Xi$
[Kallenberg (1983), pp. 13-14]. A point process is said to be simple if it has at most one
point at each location. For a point process $\Xi$ on $\Gamma$ with finite mean measure $\lambda$, the family
of point processes $\{\Xi_x : x \in \Gamma\}$ are said to be reduced Palm processes associated with $\Xi$ (at
$x \in \Gamma$) if for any measurable function $f : \Gamma \times \mathcal{H} \to \mathbb{R}_+ := [0, \infty)$,

$$\mathbb{E}\left(\int_\Gamma f(x, \Xi - \delta_x)\,\Xi(dx)\right) = \int_\Gamma \mathbb{E}f(x, \Xi_x)\,\lambda(dx), \tag{3.1}$$

[Kallenberg (1983), Chapter 10]. Intuitively, the reduced Palm distribution is defined
through the Radon-Nikodym derivative as follows:

$$\mathbb{P}(\Xi_x \in B) = \frac{\mathbb{E}\left[\Xi(dx)\mathbf{1}_{\Xi - \delta_x \in B}\right]}{\lambda(dx)} \quad \text{for all } B \in \mathcal{B}(\mathcal{H}).$$

When $\Xi$ is a simple point process, it can be interpreted as the distribution of $\Xi$ save one
point at $x$, conditional on there being one point at $x$.

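In this paper, we also need the second order reduced Palm processes $\Xi_{xy}$ of the point process
$\Xi$ at $x, y \in \Gamma$, defined as the processes satisfying

$$\mathbb{E}\left(\iint_{\Gamma^2} f(x, y; \Xi - \delta_x - \delta_y)\,\Xi(dx)(\Xi - \delta_x)(dy)\right) = \iint_{\Gamma^2}\mathbb{E}f(x, y; \Xi_{xy})\,\lambda^{[2]}(dx, dy) \tag{3.2}$$

for any measurable function $f : \Gamma^2 \times \mathcal{H} \to \mathbb{R}_+$, where $\lambda^{[2]}(dx, dy) = \mathbb{E}\left[\Xi(dx)(\Xi - \delta_x)(dy)\right]$
is called the second order factorial moment measure of $\Xi$ [Kallenberg (1983), §12.3]. The
second order reduced Palm distribution $\mathcal{L}(\Xi_{xy})$ can also be viewed as the Radon-Nikodym
derivative

$$\mathbb{P}(\Xi_{xy} \in B) = \frac{\mathbb{E}\left[\Xi(dx)(\Xi - \delta_x)(dy)\mathbf{1}_{\Xi - \delta_x - \delta_y \in B}\right]}{\lambda^{[2]}(dx, dy)} \quad \text{for all } B \in \mathcal{B}(\mathcal{H}).$$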
For $\xi \in \mathcal{H}$ and a Borel set $B \subset \Gamma$, we denote by $\xi|_B$ the restriction of $\xi$ to $B$, i.e.
$\xi|_B(C) = \xi(B \cap C)$ for all Borel sets $C \subset \Gamma$. We call $\{A_x : x \in \Gamma\}$ a type-I neighbourhood if
$x \in A_x \in \mathcal{B}(\Gamma)$ for all $x \in \Gamma$ and the mapping

$$\Gamma \times \mathcal{H} \to \Gamma \times \mathcal{H} : (x, \xi) \mapsto (x, \xi|_{A_x^c})$$

is product measurable [see Chen & Xia (2004), pp. 2547-2548 for further discussions]. We
say that $\{A_{xy} : x, y \in \Gamma\}$ is a type-II neighbourhood if $\{x, y\} \subset A_{xy} \in \mathcal{B}(\Gamma)$ for all $x, y \in \Gamma$
and the mapping

$$\Gamma^2 \times \mathcal{H} \to \Gamma^2 \times \mathcal{H} : ((x, y), \xi) \mapsto ((x, y), \xi|_{A_{xy}^c})$$

is product measurable. We now define the locally dependent structures studied in this paper.



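Definition 3.1 A point process $\Xi$ is said to satisfy the type-I local dependence if there
exist two type-I neighbourhoods $\{A_x : x \in \Gamma\}$ and $\{B_x : x \in \Gamma\}$ such that $A_x \subset B_x$,
$\mathcal{L}(\Xi_x|_{A_x^c}) = \mathcal{L}(\Xi|_{A_x^c})$, $\Xi|_{B_x^c}$ is independent of $\Xi|_{A_x}$, and $\Xi_x|_{B_x^c}$ is independent of $\Xi_x|_{A_x}$ for
all $x \in \Gamma$. A point process $\Xi$ is said to satisfy the type-II local dependence if there exist
two type-II neighbourhoods $\{A_{xy} : x, y \in \Gamma\}$ and $\{B_{xy} : x, y \in \Gamma\}$ such that $A_{xy} \subset B_{xy}$,
$\mathcal{L}(\Xi_{xy}|_{A_{xy}^c}) = \mathcal{L}(\Xi|_{A_{xy}^c})$, $\Xi|_{B_{xy}^c}$ is independent of $\Xi|_{A_{xy}}$, and $\Xi_{xy}|_{B_{xy}^c}$ is independent of
$\Xi_{xy}|_{A_{xy}}$ for all $x, y \in \Gamma$.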



The locally dependent structures introduced here are parallel to, but a little stronger
than, those in Chen & Shao (2004). The condition $\mathcal{L}(\Xi_x|_{A_x^c}) = \mathcal{L}(\Xi|_{A_x^c})$ can be loosely
interpreted as $\Xi(dx)$ being independent of $\Xi|_{A_x^c}$. One may easily establish sufficient conditions
for the locally dependent structures by imposing conditions on neighbourhoods containing
balls [see the descriptive definitions in Barbour & Xia (2006)].

To state the error estimates of the PBDP approximation to locally dependent point
processes, we need to introduce the following notation. Let $\mathcal{G} = \{G_1, \dots, G_k\} \subset \mathcal{B}(\Gamma)$ be
a partition of $\Gamma$, and we choose $t_i \in \Gamma$ such that $\sup_{s \in G_i} d_0(s, t_i)$ is as small as possible,
$i = 1, \dots, k$. Note that $t_i$, regarded as the 'designated center' of the set $G_i$, is not necessarily
in $G_i$. We define $\psi_{\mathcal{G}} \circ \eta := \sum_{i=1}^{k}\eta(G_i)\delta_{t_i}$ for $\eta \in \mathcal{H}$. The mapping is to 'assemble' all the
points of the configuration $\eta$ in each $G_i$ to its center $t_i$. If we set $d_0(\mathcal{G})$ as

$$d_0(\mathcal{G}) = \max_{1 \le i \le k}\sup_{s \in G_i} d_0(s, t_i),$$

then it is easy to check that

$$d_1(\eta, \psi_{\mathcal{G}} \circ \eta) \le d_0(\mathcal{G}). \tag{3.3}$$
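The 'assembling' step and the bound $d_1(\eta, \text{assembled } \eta) \le d_0(\mathcal{G})$ can be checked numerically in a simple setting. The sketch below assumes $\Gamma = [0, 1]$, $d_0(x, y) = |x - y|$ and a partition into $k$ equal-width intervals with the midpoints as designated centers, so that $d_0(\mathcal{G}) = 1/(2k)$; these choices and the function names are illustrative only. In one dimension with this cost, pairing the sorted points is an optimal matching, which is what `d1_line` uses.

```python
def assemble(points, k):
    """Move every point of a configuration on [0, 1] to the midpoint of its
    cell in the partition into k equal-width intervals (an assumed partition)."""
    centers = []
    for x in points:
        j = min(int(x * k), k - 1)      # index of the cell containing x
        centers.append((j + 0.5) / k)   # designated center of that cell
    return centers


def d1_line(xs, ys):
    """d1 for configurations on [0, 1] with d0(x, y) = |x - y|; the sorted
    pairing is optimal for this one-dimensional cost."""
    if len(xs) != len(ys):
        return 1.0
    if not xs:
        return 0.0
    return sum(abs(x - y) for x, y in zip(sorted(xs), sorted(ys))) / len(xs)
```

For example, with `eta = [0.03, 0.41, 0.42, 0.99]` and `k = 5`, the assembled configuration sits at the centers 0.1, 0.5, 0.5 and 0.9, and `d1_line(eta, assemble(eta, 5))` does not exceed `1 / (2 * 5)`.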

Let $u$ be a positive constant to be chosen in applications, and we take $u = 2$ for our examples
in Section 4. Let $\mathcal{F}_{tv}$ be the set of indicator functions of all sets in $\mathcal{B}(\mathcal{H})$. For a point
process $\Xi$, we define



r.(S) := 4P(S(5:) + 1<^ 

4m + 10 I 
+ max sup E [/(^go(H|ijj)) - /(^c;o(H|ijj) + 5t^) 
Similarly, $\tilde r_x(\Xi)$ is defined by replacing all the conditional expectations/probabilities in the
definition of $r_x(\Xi)$ with expectations/probabilities. It is worthwhile to point out that the type-I
local dependence implies $\tilde r_x(\Xi) = \tilde r_x(\Xi_x)$. Let

$$\begin{aligned}
e_{1,x}(\Xi) &= r_x(\Xi)\Xi(A_x)\Xi(B_x \setminus A_x) + \tilde r_x(\Xi)[\Xi(A_x) + 1]\Xi(A_x)/2 + \Xi(A_x)\mathbb{E}[r_x(\Xi)\Xi(B_x)], \\
e_{1,x}(\Xi_x) &= r_x(\Xi_x)\Xi_x(A_x)\Xi_x(B_x \setminus A_x) + \tilde r_x(\Xi_x)[\Xi_x(A_x) + 1]\Xi_x(A_x)/2 + \Xi_x(A_x)\mathbb{E}[r_x(\Xi)\Xi(B_x)], \\
e_{2,x}(\Xi) &= r_x(\Xi)\Xi(B_x \setminus A_x) + \tilde r_x(\Xi) + \mathbb{E}[r_x(\Xi)\Xi(B_x)], \\
e_{2,x}(\Xi_x) &= r_x(\Xi_x)\Xi_x(B_x \setminus A_x) + \tilde r_x(\Xi_x) + \mathbb{E}[r_x(\Xi)\Xi(B_x)].
\end{aligned}$$

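In terms of the type-II local dependence, we define $r_{x,y}$ and $\tilde r_{x,y}$ in the same way as $r_x$ and
$\tilde r_x$ respectively, but with $B_x$ replaced by $B_{xy}$. We then set

$$\begin{aligned}
e_{1,x,y}(\Xi) &= r_{x,y}(\Xi)\Xi(A_{xy})\Xi(B_{xy} \setminus A_{xy}) + \tilde r_{x,y}(\Xi)(\Xi(A_{xy}) + 1)\Xi(A_{xy})/2 + \Xi(A_{xy})\mathbb{E}[r_{x,y}(\Xi)\Xi(B_{xy})], \\
e_{1,x,y}(\Xi_{xy}) &= r_{x,y}(\Xi_{xy})\Xi_{xy}(A_{xy})\Xi_{xy}(B_{xy} \setminus A_{xy}) + \tilde r_{x,y}(\Xi_{xy})(\Xi_{xy}(A_{xy}) + 1)\Xi_{xy}(A_{xy})/2 + \Xi_{xy}(A_{xy})\mathbb{E}[r_{x,y}(\Xi)\Xi(B_{xy})], \\
e_{2,x,y}(\Xi) &= r_{x,y}(\Xi)\Xi(B_{xy}) + \tilde r_{x,y}(\Xi) + \mathbb{E}[r_{x,y}(\Xi)\Xi(B_{xy})], \\
e_{2,x,y}(\Xi_{xy}) &= r_{x,y}(\Xi_{xy})\Xi_{xy}(B_{xy}) + \tilde r_{x,y}(\Xi_{xy}) + \mathbb{E}[r_{x,y}(\Xi)\Xi(B_{xy})].
\end{aligned}$$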

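Theorem 3.2 Assume that the point process $\Xi$ on $\Gamma$ with finite mean measure $\lambda$ satisfies
$\mathrm{Var}(|\Xi|) > \mathbb{E}|\Xi|$ and the type-I local dependence. Let $\nu(dx) = \lambda(dx)/|\lambda|$, $b = [\mathrm{Var}(|\Xi|) -
\mathbb{E}|\Xi|]/\mathrm{Var}(|\Xi|)$, $a = (1 - b)|\lambda|$, then

$$d_2\left(\mathcal{L}(\Xi), \boldsymbol{\pi}_{a,b;0;\nu}\right) \le 2d_0(\mathcal{G}) + \int_\Gamma \mathbb{E}\left[(1 + b)(e_{1,y}(\Xi_y) + e_{1,y}(\Xi)) + b\,\tilde r_y(\Xi)\Xi_y(A_y) + b\,e_{2,y}(\Xi_y)\right]\lambda(dy).$$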
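The parameter choice $b = [\mathrm{Var}(|\Xi|) - \mathbb{E}|\Xi|]/\mathrm{Var}(|\Xi|)$, $a = (1 - b)|\lambda|$ in Theorem 3.2 can be read as matching the first two moments of the total count. The sketch below rests on the standard fact that the birth-death chain with birth rates $a + bk$ and death rates $k$ (the case $\beta = 0$) has a negative binomial invariant distribution with mean $a/(1-b)$ and variance $a/(1-b)^2$; the function name is an illustrative assumption.

```python
def fit_overdispersed(mean, var):
    """Parameters (a, b) of Theorem 3.2 from the mean and variance of the
    total count; requires var > mean > 0 (the overdispersed case)."""
    if not (var > mean > 0):
        raise ValueError("Theorem 3.2 needs Var > mean > 0")
    b = (var - mean) / var
    a = (1 - b) * mean
    return a, b
```

For instance, a total count with mean 10 and variance 15 gives $b = 1/3$ and $a = 20/3$, and then $a/(1-b) = 10$ and $a/(1-b)^2 = 15$ recover the two input moments.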

Theorem 3.3 Assume the point process $\Xi$ on $\Gamma$ with finite mean measure $\lambda$ satisfies $\mathrm{Var}(|\Xi|) <
\mathbb{E}|\Xi|$, the type-I and type-II local dependence. Let

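$$\beta = \frac{|\lambda| - \mathrm{Var}(|\Xi|)}{\mathbb{E}\left[(|\Xi| - 1)|\Xi|^2\right] - (|\lambda| + 1)\left(\mathbb{E}|\Xi|^2 - |\lambda|\right)}, \qquad a = |\lambda| + \beta\left(\mathbb{E}|\Xi|^2 - |\lambda|\right), \tag{3.4}$$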

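and

$$\nu(dx) = \frac{1}{a}\left(\lambda(dx) + \beta\int_{y \in \Gamma}\lambda^{[2]}(dx, dy)\right). \tag{3.5}$$

If $\beta > 0$, then

$$\begin{aligned}
d_2\left(\mathcal{L}(\Xi), \boldsymbol{\pi}_{a,0;\beta;\nu}\right) \le{}& 2d_0(\mathcal{G}) + \int_\Gamma \mathbb{E}\left(e_{1,x}(\Xi_x) + e_{1,x}(\Xi)\right)\lambda(dx) \\
&+ \beta\iint_{\Gamma^2}\mathbb{E}\left(e_{1,x,y}(\Xi_{xy}) + e_{1,x,y}(\Xi) + e_{2,x,y}(\Xi_{xy})\right)\lambda^{[2]}(dx, dy).
\end{aligned}$$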

Remark 3.4 When one applies these theorems, it is advisable to leave the choice of $\mathcal{G}$ to
the last stage so that an optimal bound with the best possible order can be achieved.
A less noticeable fact is that if one takes $d_0(x, y) \equiv 0$, i.e. a pseudometric on $\Gamma$, and
$\mathcal{G} = \{\Gamma\}$, then $d_2$ reduces to $d_{tv}$ for the total counts of point processes, so our theorems also
cover the PBDP approximation to the total counts of locally dependent point processes in
the total variation distance.

The proofs of the two theorems will be given in Sections 6 and 7. In the next section, let
us look at three examples to see how the theorems perform in applications.



4 Applications 
4.1 Bernoulli process 

Let $\Gamma = [0, 1]$, $d_0(x, y) = |x - y|$, and let $I_1, \dots, I_n$ be independent Bernoulli random variables with

$$\mathbb{P}(I_i = 1) = 1 - \mathbb{P}(I_i = 0) = p_i, \quad 1 \le i \le n.$$

Define $\Xi = \sum_{i=1}^{n} I_i\delta_{i/n}$. This simple point process is particularly useful for proving the
Poisson process limit theorems for the extreme value theory [Embrechts, Klüppelberg and
Mikosch (1997), Chapter 5]. It was proved in Xia (1997), Proposition 3.6 [see also Ruzankin (2004)],
that the accuracy of Poisson process approximation to $\mathcal{L}(\Xi)$ is of order $\sum_{i=1}^{n} p_i^2 / \sum_{i=1}^{n} p_i$ and
the order cannot be improved when $n$ becomes large. When the $p_i$'s are equal to $p$, Xia &
Zhang (2008), making use of the symmetric nature of the distribution $\mathcal{L}(\Xi)$, proved that an
appropriate PBDP can approximate J^'E with approximation error of order + p) A 
However, when the $p_i$'s are not the same, the techniques employed in Xia & Zhang (2008) will
not work and we demonstrate below that our theorems can be applied to this case.

First of all, it is easy to verify that $\Xi$ has mean measure $\lambda(dx) = \sum_{i=1}^{n} p_i\delta_{i/n}(dx)$ and
its second order factorial moment measure is $\lambda^{[2]}(dx, dy) = \sum_{1 \le i \ne j \le n} p_i p_j\delta_{i/n}(dx)\delta_{j/n}(dy)$.
Clearly, $\mathbb{E}|\Xi| > \mathrm{Var}(|\Xi|)$, so we can apply Theorem 3.3 to estimate the approximation error
for $\mathcal{L}(\Xi)$.
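To identify the approximating PBDP distribution, we let

$$\lambda_j = \sum_{i=1}^{n} p_i^j, \quad j \ge 2,$$

$$\beta = \frac{\lambda_2}{|\lambda|^2 - \lambda_2 - 2|\lambda|\lambda_2 + 2\lambda_3}, \qquad a = |\lambda| + \beta\left(|\lambda|^2 - \lambda_2\right),$$

[cf. Brown and Xia (2001), Theorem 3.1] and

$$\nu(dx) = \frac{1}{a}\left(\lambda(dx) + \beta\int_{y \in \Gamma}\lambda^{[2]}(dx, dy)\right) = \frac{1}{a}\left(\lambda(dx) + \beta\sum_{i=1}^{n}(|\lambda| - p_i)p_i\delta_{i/n}(dx)\right).$$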
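The parameters of the approximating PBDP for the Bernoulli process are explicit functions of the success probabilities: $\lambda_j = \sum_i p_i^j$, $\beta = \lambda_2/(|\lambda|^2 - \lambda_2 - 2|\lambda|\lambda_2 + 2\lambda_3)$, $a = |\lambda| + \beta(|\lambda|^2 - \lambda_2)$, and $\nu$ puts mass $p_i(1 + \beta(|\lambda| - p_i))/a$ at $i/n$. The following is a small sketch of that computation; the function name and the returned list of weights are illustrative assumptions.

```python
def bernoulli_pbdp_parameters(p):
    """Return (beta, a, nu) for success probabilities p[0..n-1], where nu[i]
    is the mass that the location distribution puts at the site (i+1)/n."""
    lam = sum(p)                          # |lambda|
    lam2 = sum(q ** 2 for q in p)         # lambda_2
    lam3 = sum(q ** 3 for q in p)         # lambda_3
    denom = lam ** 2 - lam2 - 2 * lam * lam2 + 2 * lam3
    if denom <= 0:
        raise ValueError("beta must be positive for Theorem 3.3 to apply")
    beta = lam2 / denom
    a = lam + beta * (lam ** 2 - lam2)
    nu = [q * (1 + beta * (lam - q)) / a for q in p]
    return beta, a, nu
```

The weights always sum to one, because $\sum_i p_i(1 + \beta(|\lambda| - p_i)) = |\lambda| + \beta(|\lambda|^2 - \lambda_2) = a$. When all $p_i$ equal $p < 1/2$, the formulas reduce algebraically to $\beta = 1/((n-1)(1-2p))$, $a = np(1-p)/(1-2p)$ and uniform weights $1/n$.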

Next, we set up an appropriate partition $\mathcal{G} = \{G_1, \dots, G_k\}$ of $\Gamma$. Let $1 \le u_1, \dots, u_k \le n$ be
such that $u_1 + \dots + u_k = n$. Set $s_0 = 0$, $s_j = s_{j-1} + u_j$ for $1 \le j \le k$, $G_1 = [0, s_1/n]$ and
$G_j = (s_{j-1}/n, s_j/n]$ for $2 \le j \le k$. We choose $t_j$ as the middle point of the interval $G_j$, $1 \le j \le k$,
so that $d_0(\mathcal{G}) = \max_{1 \le j \le k} u_j/(2n)$. Define $W_j = \sum_{i=s_{j-1}+1}^{s_j} I_i$, $1 \le j \le k$, and

$$\kappa := \max_{1 \le j \le k}\ \max_{s_{j-1}+1 \le i_1 \ne i_2 \le s_j} d_{tv}\left(\mathcal{L}(W_j - I_{i_1} - I_{i_2}), \mathcal{L}(W_j - I_{i_1} - I_{i_2} + 1)\right)$$

1 

< max max 1 A 



where the inequality is due to Lemma 1 of Barbour and Jensen (1989). We take $A_x = B_x = \{x\}$,
$A_{xy} = B_{xy} = \{x, y\}$ and $u = 2$, then $\Xi_x(A_x) = \Xi_x(B_x) = \Xi_{xy}(A_{xy}) = \Xi_{xy}(B_{xy}) = 0$,



P(|2|-4-4)<^) <0(|A| 



'2\ 



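hence all of $r_x$, $\tilde r_x$, $r_{x,y}$ and $\tilde r_{x,y}$ are bounded by $O(\kappa/a)$. Applying Theorem 3.3 gives the
following estimate.

Theorem 4.1 With the above setup, if $\beta > 0$, then

$$d_2\left(\mathcal{L}(\Xi), \boldsymbol{\pi}_{a,0;\beta;\nu}\right) \le \max_{1 \le j \le k} u_j/n + O\left(\kappa\lambda_2/|\lambda|\right).$$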

As a special case, we now assume the $p_i$'s are equal to $p$, and take $k = O\left((n(1 - p)/p)^{1/3}\right)$,
$u_j = O\left((pn^2/(1 - p))^{1/3}\right)$, $j = 1, \dots, k$, then

$$\kappa = O\left(1 \wedge \frac{1}{(np^2(1 - p))^{1/3}}\right).$$

Hence, the following corollary is immediate.

Corollary 4.2 For the Bernoulli point process $\Xi = \sum_{i=1}^{n} I_i\delta_{i/n}$, where $\{I_i, 1 \le i \le n\}$ are
independent and identically distributed Bernoulli random variables with $\mathbb{P}(I_i = 1) = p$, let
$\beta = \frac{1}{(n-1)(1-2p)}$, $a = np(1 - p)/(1 - 2p)$, $\nu(dx) = \frac{1}{n}\sum_{i=1}^{n}\delta_{i/n}(dx)$, then

provided p < 1/2. 



Remark 4.3 The bound f l4.ip is not as good as the bound O ^(^ +p) A derived in 

Xia & Zhang (2008) when $p$ is fixed and $n$ becomes large. Nevertheless, our method does
not rely on the specific symmetric structure of the Bernoulli process $\Xi$ and the bound is still
valid even if the success probabilities for the Bernoulli random variables vary moderately.



Remark 4.4 A Poisson process approximation to the Bernoulli process is justified when
$p \to 0$ and $np \to \lambda$. However, in applications of extreme value theory, the value $p$ is often
fixed while $n$ is large, so our theory provides a more practical alternative.
4.2 Compound Poisson process

Barbour & Månsson (2002) considered compound Poisson process approximation in the $d_2$
distance. The Stein factors for both compound Poisson random variable and process
approximations are generally too crude to use unless they are sufficiently close to their Poisson
counterparts or satisfy some other restrictive conditions. In this example, we will show that
our PBDP, suitably chosen, will converge to the compound Poisson process when its cluster
distribution is fixed and the mean of the Poisson process component becomes large,
regardless of whether the compound Poisson process is sufficiently close to a Poisson process or
not.

To begin with, let $\Xi = \sum_{i=1}^{\infty} iX_i$, where $\{X_i\}$ are independent Poisson processes on $\Gamma$ with
mean measures $\{\mu_i\}$ respectively. For brevity, we write $\Xi \sim \mathrm{CP}(\mu_1, \mu_2, \dots)$. It is easy to see
that $\mathrm{Var}(|\Xi|) \ge \mathbb{E}|\Xi|$, with equality if and only if $\mu_j = 0$ for all $j \ge 2$.
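The overdispersion of the total count is easy to see from the standard moment formulas for a compound Poisson sum: with cluster size $i$ arriving at total rate $|\mu_i|$, the mean is $\sum_i i|\mu_i|$ and the variance is $\sum_i i^2|\mu_i|$. The sketch below computes these together with the parameters $b = (\mathrm{Var} - \text{mean})/\mathrm{Var}$ and $a = (1 - b)\cdot\text{mean}$ used in Theorem 3.2; the function name and the list encoding of the masses are illustrative assumptions.

```python
def compound_poisson_parameters(masses):
    """masses[i-1] is the total mass |mu_i| of the Poisson process of
    clusters of size i. Returns (mean, var, a, b) for the total count."""
    mean = sum(i * m for i, m in enumerate(masses, start=1))
    var = sum(i * i * m for i, m in enumerate(masses, start=1))
    if mean <= 0:
        raise ValueError("at least one mass must be positive")
    b = (var - mean) / var
    a = (1 - b) * mean
    return mean, var, a, b
```

For example, `compound_poisson_parameters([2.0, 1.0])` has mean 4 and variance 6, so $b = 1/3$ and $a = 8/3$; if only $|\mu_1|$ is non-zero the variance equals the mean and $b = 0$, the Poisson case.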

Suppose that we have a partition $\mathcal{G} = \{G_1, \dots, G_k\}$ of $\Gamma$.



Theorem 4.5 Let $\lambda(dx) = \sum_{i=1}^{\infty} i\mu_i(dx)$, $\nu(dx) = \lambda(dx)/|\lambda|$, and







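$$a = \frac{\left(\sum_{i=1}^{\infty} i|\mu_i|\right)^2}{\sum_{i=1}^{\infty} i^2|\mu_i|}, \qquad b = \frac{\sum_{i=1}^{\infty} i(i - 1)|\mu_i|}{\sum_{i=1}^{\infty} i^2|\mu_i|}.$$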







Then 

(1 \ '3 1 I 

max lA^=== ^-=^' '^^' +2do(g). (4.2) 



Remark 4.6 Suppose the cluster distribution is fixed everywhere and $\mu_1(G) \to \infty$ for every
$G \in \mathcal{B}(\Gamma)$ such that $\mu_1(G) > 0$, then the upper bound given in (4.2) has the order $o(1)$. To
this end, one can partition $\Gamma$ into sets with diameters small enough, then for each set $G_i$ with
$\mu_1(G_i) > 0$, one can find $\mu_1(G_i)$ as large as one wishes. Furthermore, suppose $\Gamma$ is a simply
connected domain in M"' with smooth boundary, do{x,y) = \x — y\Al, and /Xi is proportional 
to the Lebesgue measure, i.e. points are homogeneous on P. Then, the upper bound given in 
(14. 2 p has the order O (^\fJ'i\~'^^ ■ As a matter of fact, one can partition P into boxes with 

the same diameter of order O then combine the parts at the boundary of P to 

their adjacent boxes totally belonging to P, to obtain Q. 



Remark 4.7 The other possible way to define a negative binomial process is through a
compound Poisson process having a Poisson process of clusters, where each cluster carries a random
number of points that follows a logarithmic distribution. Remark 4.6 ensures that if the
logarithmic distribution for the clusters is fixed and the Poisson process is homogeneous, then
the process will converge to our PBDP distribution when the mean measure of the Poisson
process becomes large.



Proof of Theorem 4.5 A measure $\mu$ is called diffuse if for every point $x \in \Gamma$, $\mu(\{x\}) = 0$. If
$\{\mu_i\}$ are not diffuse, we can enlarge the space $\Gamma$ if necessary and take diffuse measures $\mu_i^n$
such that $|\mu_i^n| = |\mu_i|$ for $i \ge 1$ and $\max_{i \ge 1} d_1(\mu_i^n, \mu_i) \to 0$ as $n \to \infty$. We then apply the
Kantorovich-Rubinstein duality theorem [Rachev (1991), Theorem 8.1.1, p. 168] to couple
two sequences of $\Gamma$-valued random elements $\tau_{ij} \sim \mu_i/|\mu_i|$ and $\tau_{ij}^n \sim \mu_i^n/|\mu_i^n|$, $i, j \ge 1$, such
that

$$\mathbb{E}d_0(\tau_{ij}, \tau_{ij}^n) = d_1(\mu_i, \mu_i^n)$$

and $\{(\tau_{ij}, \tau_{ij}^n), i, j \ge 1\}$ are independent and independent of $\{X_i, i \ge 1\}$. Let $\Xi^n \sim
\mathrm{CP}(\mu_1^n, \mu_2^n, \dots)$, then

i=i j=i i=i j=i 



< E 



< E 



$$\le \max_{i \ge 1} d_1(\mu_i^n, \mu_i).$$

This observation, together with Lemma 2.3, ensures that we can assume, without loss of
generality, that $\{\mu_i\}$ are all diffuse. Otherwise, we can approximate each $\mathcal{L}(\Xi^n)$ with a suitable
PBDP distribution and then take the limits.

Direct computation gives 

OO OO 

i=l 1=1 





|A| 


2 




1 -o 1 





EOO -o I 

Because the compound Poisson process has independent increments, we let $A_x = B_x =
\{x\}$, then

= 4Pf|S| + l < -) + + max dry (^(^goS), ^ (^goS + )) , 
\ u/ a i<j<k ^ ^ 

where for any two point process distributions P and Q on J^, (i^y (P, Q) := inf^^p^^q 7^ 
77). Noting that {yUj} are all diffuse and consequently = a.s. for each x G P, we have 

ei,^(S) = 0, ei,4S^) = f4S^)(H4{a;}) + l)H^.({x})/2, e2,x(Sx) = rx(S^). (4.3) 
It is well known that if $Y$ follows the Poisson distribution with mean $c$, then

$$d_{TV}(\mathcal{L}(Y), \mathcal{L}(Y + 1)) = \max_{j \ge 0} P(Y = j) \le \frac{1}{\sqrt{2ec}}$$

[see Barbour, Holst & Janson (1992), Proposition A.2.7, p. 262]. Hence, we have



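
The Poisson smoothness bound quoted above can be checked numerically. The following sketch computes the total variation distance between a Poisson variable and its shift by one, then compares it with the largest point probability and with $1/\sqrt{2ec}$. The function names, the truncation level and the chosen means are assumptions made for illustration.

```python
import math

def poisson_pmf(j, c):
    """P(Y = j) for Y ~ Poisson(c), computed on the log scale."""
    return math.exp(-c + j * math.log(c) - math.lgamma(j + 1))

def tv_shift_by_one(c, n_terms=400):
    """d_TV(L(Y), L(Y+1)) = (1/2) * sum_j |P(Y=j) - P(Y=j-1)|, truncated at n_terms."""
    total, prev = 0.0, 0.0
    for j in range(n_terms):
        cur = poisson_pmf(j, c)
        total += abs(cur - prev)
        prev = cur
    # Beyond the mode the pmf decreases, so the remaining terms telescope to prev.
    return 0.5 * (total + prev)

results = []
for c in (0.5, 2.0, 10.0, 50.0):        # illustrative means (assumptions)
    tv = tv_shift_by_one(c)
    peak = max(poisson_pmf(j, c) for j in range(400))
    bound = 1.0 / math.sqrt(2.0 * math.e * c)
    results.append((c, tv, peak, bound))
    print(f"c={c:5.1f}  d_TV={tv:.6f}  max pmf={peak:.6f}  bound={bound:.6f}")
```

At $c = 1/2$ the largest point probability is $e^{-1/2}$, which coincides with $1/\sqrt{2ec}$, so the bound cannot be improved uniformly in $c$.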
It is easy to see that we can write $|\Xi| = \sum_{i=1}^{V} \eta_i$, where all the random variables $V$ and $\eta_i$'s
are independent, $V \sim \mathrm{Poisson}(|\mu'|)$ with $\mu' = \sum_{i \ge 1} \mu_i$; the $\eta_i$'s have the same distribution
^{Vi = j) = J ^ 1- If we take u = 2, noting that a < we have 

P + 1 < I) < P (v^ < < 0(1/.'!-) < O(a-). 



Hence, 



f^(S^) = { a-^ max 1 A , ^ 1 . (4.4) 



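Using the independent increments again, we get

$$\mathrm{Var}(|\Xi|) = \mathbb{E}\int_\Gamma (|\Xi| - |\lambda|)\,\Xi(dx) = \int_\Gamma \mathbb{E}(|\Xi_x| + 1 - |\lambda|)\,\lambda(dx) = \int_\Gamma \mathbb{E}(\Xi_x(\{x\}) + 1)\,\lambda(dx),$$

$$\mathbb{E}(|\Xi| - 1)|\Xi|^2 = \mathbb{E}\int_\Gamma |\Xi|\,|\Xi - \delta_x|\,\Xi(dx) = \int_\Gamma \mathbb{E}(|\Xi_x| + 1)|\Xi_x|\,\lambda(dx)$$

$$= |\lambda|\,\mathbb{E}|\Xi|^2 + 2|\lambda| \int_\Gamma \mathbb{E}\Xi_x(\{x\})\,\lambda(dx) + |\lambda|^2 + \int_\Gamma \mathbb{E}(\Xi_x(\{x\}) + 1)\Xi_x(\{x\})\,\lambda(dx),$$

which in turn imply

$$\int_\Gamma \mathbb{E}\Xi_x(\{x\})\,\lambda(dx) = \mathrm{Var}(|\Xi|) - |\lambda|, \quad (4.5)$$

$$\int_\Gamma \mathbb{E}(\Xi_x(\{x\}) + 1)\Xi_x(\{x\})\,\lambda(dx) = \mathbb{E}(|\Xi| - |\lambda|)^3 - \mathrm{Var}(|\Xi|). \quad (4.6)$$

Applying Theorem 3.2 and (4.3)-(4.6), together with $0 < b < 1$, gives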

0?2(CP(/ii,/i2,...),7ra^6;0;i.) 

< 2doig) + ^E [f,(S,) ((S,.({x}) + l)S,({x}) + S,({x}) + 1)] Xidx) 

< 2do{g) + ola-'nmx lA^l==] [ E [{E,{{x}) + l)E,{{x}) + E,{{x}) + l] X{dx) 

= 24(e)+o(a-mglA^=i=)E(|E|-|A|)^ (4.7) 
Finally, one can verify directly that 

i=l / 1=1 1=1 

which, together with (4.7), implies (4.2). □

4.3 Runs



In the final example, we consider the point process of $k$-runs of 1's in a sequence of
independent and identically distributed Bernoulli random variables [cf. Example 5.2 of Barbour
& Mansson (2002), p. 1527]. It is easy to see from our derivation that, at the cost of more
notational complexity, one can lift the assumption of identical distribution.

To begin with, let $I_1, \dots, I_n$ be independent Bernoulli random variables with identical
distribution

$$P(I_i = 1) = 1 - P(I_i = 0) = p, \quad 1 \le i \le n.$$

Let $X_i = \prod_{j=i}^{i+k-1} I_j$ with $k \ge 2$, where we take $I_j = I_{j-n}$ for $j > n$ to avoid the edge effect.
We define the point process of runs as

$$\Xi = \sum_{i=1}^{n} X_i \delta_{i/n}$$

on $\Gamma = [0, 1]$, with 0 being identified the same as 1 and the distance on the circle $d_0(x, y) =
|x - y| \wedge (1 - |x - y|)$. A point of $\Xi$ at location $i/n$ indicates that there is a run of $k$ 1's starting
at index $i$, and it is clear that the run may overlap with others around it. Wang & Xia (2008)
demonstrated that $\mathrm{Var}(|\Xi|) > \mathbb{E}|\Xi|$ if and only if $2 + (2k - 1)p^k - (2k + 1)p^{k-1} > 0$, and
the latter is easily satisfied if $p < 2/3$. Hence we only consider negative binomial process
approximation to the distribution of $\Xi$.

Theorem 4.8 Let $k \ge 2$ be a fixed integer,

$$a = \frac{(1 - p)np^k}{1 + p - (2k + 1)p^k + (2k - 1)p^{k+1}}, \quad b = \frac{p[2 - (2k + 1)p^{k-1} + (2k - 1)p^k]}{1 + p - (2k + 1)p^k + (2k - 1)p^{k+1}},$$

and $\nu(dx) = \frac{1}{n}\sum_{i=1}^{n} \delta_{i/n}(dx)$. Assume $p < 2/3$, then

d.{J^E..^,,.,.,)<{^{>^)- 

[ 0{p), if np'^ < 1. 

Remark 4.9 The point process of runs in Barbour & Mansson (2002), Example 5.2, is
defined on the carrier space $\Gamma' = [0, n]$ with 0 being identified the same as $n$ and metric
$d_0'(x, y) = (|x - y| p^k) \wedge 1$, where $|\cdot|$ is the distance on the circle. Although $d_0'$ seems to be
a natural choice in the context of compound Poisson process approximation, it depends on
the mean of the process being approximated. An unexpected effect is that, when the parameters
vary, it is impossible to judge from the error estimates whether the approximations become
better or worse. Another defect of the approach in Barbour & Mansson (2002) is that a
factor $\ln n$ inevitably appears in the approximation bound, which makes it useless when $n$
becomes large. In practical applications, $p$ is often fixed while $n$ tends to be large so that
approximate distributions are needed. Our approximating distribution uses fewer parameters
but achieves an approximation bound that decreases when $p$ becomes small and/or $n$ becomes
large.

Proof of Theorem 4.8. It is easy to verify that the mean measure of $\Xi$ is $\lambda(dx) = p^k \sum_{i=1}^{n} \delta_{i/n}(dx)$,
$\mathbb{E}|\Xi| = |\lambda| = np^k$ and $\mathrm{Var}(|\Xi|) = \frac{np^k}{1 - p}\left(1 + p - (2k + 1)p^k + (2k - 1)p^{k+1}\right)$, hence we set

$$\nu(dx) = \frac{1}{n}\sum_{i=1}^{n} \delta_{i/n}(dx),$$

$$b = \frac{\mathrm{Var}(|\Xi|) - \mathbb{E}|\Xi|}{\mathrm{Var}(|\Xi|)} = \frac{p[2 - (2k + 1)p^{k-1} + (2k - 1)p^k]}{1 + p - (2k + 1)p^k + (2k - 1)p^{k+1}},$$

$$a = (1 - b)np^k = \frac{(1 - p)np^k}{1 + p - (2k + 1)p^k + (2k - 1)p^{k+1}}.$$
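
The mean, the variance and the resulting parameters $a$ and $b$ can be checked by brute force on a short circular sequence. The sketch below enumerates all $2^n$ sequences for small illustrative values of $n$, $k$ and $p$ (assumptions chosen so that $n \ge 2k - 1$) and compares the exact moments of the number of $k$-runs with the closed forms $np^k$ and $np^k(1 + p - (2k+1)p^k + (2k-1)p^{k+1})/(1-p)$. The function names are hypothetical.

```python
from itertools import product

def run_count(bits, k):
    """Number of (possibly overlapping) k-runs of 1's, indices taken modulo n."""
    n = len(bits)
    return sum(all(bits[(i + j) % n] for j in range(k)) for i in range(n))

def exact_moments(n, k, p):
    """Mean and variance of the number of k-runs, enumerating all 2^n sequences."""
    m1 = m2 = 0.0
    for bits in product((0, 1), repeat=n):
        ones = sum(bits)
        prob = p**ones * (1 - p)**(n - ones)
        w = run_count(bits, k)
        m1 += prob * w
        m2 += prob * w * w
    return m1, m2 - m1 * m1

n, k, p = 10, 3, 0.3                      # small illustrative case (assumption)
mean, var = exact_moments(n, k, p)
denom = 1 + p - (2 * k + 1) * p**k + (2 * k - 1) * p**(k + 1)
var_formula = n * p**k * denom / (1 - p)
b = p * (2 - (2 * k + 1) * p**(k - 1) + (2 * k - 1) * p**k) / denom
a = (1 - p) * n * p**k / denom
print("mean:", mean, "formula:", n * p**k)
print("variance:", var, "formula:", var_formula)
print("b:", b, "(var - mean) / var:", (var - mean) / var)
print("a:", a, "(1 - b) * mean:", (1 - b) * mean)
```

The enumeration costs $2^n$ evaluations, so it is only a sanity check for small $n$ and plays no role in the asymptotic argument.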



We assume $|\lambda| \ge 1$ first. To tackle the dependence resulting from the overlapping runs,
we introduce the neighbourhoods $A_{i/n} = \{j'/n : i - k + 1 \le j \le i + k - 1\}$ and $B_{i/n} =
\{j'/n : i - 2k + 2 \le j \le i + 2k - 2\}$, where $j'$ is interpreted as $j + n$ if $j \le 0$, and
$j - n$ if $j > n$. Next, we choose $\mathcal{G} = \{G_j : 1 \le j \le l_n\}$ by taking $l_n = O(n^{1/3}p^{(k-2)/3})$,
$G_j = (s_{j-1}/n, s_j/n]$ for $j = 1, \dots, l_n$, where $s_0 = 0$, $s_j = s_{j-1} + u_j$ for $j = 1, \dots, l_n$, with
$u_1, \dots, u_{l_n} = O(n^{2/3}p^{(2-k)/3})$ and $\sum_{j=1}^{l_n} u_j = n$.

To estimate $r_x$, we take $u = 2$, write $x = i/n$ and $Y_i = \sum_{|j - i| > 4k - 4} X_j$. Applying the
Bienaymé-Chebyshev inequality gives



However, 



p {e{b:) + i<-^ 

< P (\Yi-'EYi\ > 



I -Bx 



< P ( K + 1 < < P f - EKi < - - EKi 
uJ \ u 



{n - {8k - 7))/ 



< 



((1 + 6)npV2 - {8k - 7)p^Y 



KI- (4-8) 



|>-j|>4fc-4, ■u=l, 2,3,4 ?^=1 



and the summand reduces to 0 if one of the $j_v$'s is not in the neighbourhoods of the others,
hence



E(F,-Ey,f < 12|Ap 



1 _ pfc 
1 — p 

This, together with (4.8), implies

\ u 
The same argument also leads to 



9|A|(12A;-9) =0(|Ap). 



<0(|A|-). 



P (s, (S 



+ 1 < - 
u 



(4.9) 



(4.10) 



For / G ^ry, we will show that 



|E [/ (^,o - / (^,o + 5*,)] I < O ( 



l/3„-(fc+l)/3 



n ' p 



)• 



(4.12)

In fact, if we write $t_j = (s_{j-1} + s_j)/(2n)$, $x = i/n$, then there are two cases to consider.

Case 1. $s_{j-1} < i \le s_j$. Because of the symmetry of our argument, we assume without
loss of generality that $i \le (s_{j-1} + s_j)/2$. We write $\mathbf{I}_1 = (I_1, \dots, I_{i+2k-2}, I_{s_j-k+1}, \dots, I_n)$,
$\mathbf{I}_2 = (I_{i+2k-1}, \dots, I_{s_j-k})$ and $\mathbf{v} = (v_1, \dots, v_{i+2k-2}, v_{s_j-k+1}, \dots, v_n)$. For any vector $\mathbf{v}$ with
$v_l \in \{0, 1\}$, $\forall l$, due to Wang and Xia (2008, Lemma 2.1), the number $W(\mathbf{v}, \mathbf{I}_2)$ of $k$-runs of the sequence

$$v_{s_{j-1}+1}, \dots, v_{i+2k-2}, I_{i+2k-1}, \dots, I_{s_j-k}, v_{s_j-k+1}, \dots, v_{s_j}$$

satisfies

$$d_{TV}(\mathcal{L}W(\mathbf{v}, \mathbf{I}_2), \mathcal{L}(W(\mathbf{v}, \mathbf{I}_2) + 1)) \le O\left(n^{-1/3}p^{-(k+1)/3}\right). \quad (4.13)$$

For ease of notation, we use $\Xi(\mathbf{v}, \mathbf{I}_2)$ to stand for the point process of $k$-runs of the sequence

$$v_1, \dots, v_{i+2k-2}, I_{i+2k-1}, \dots, I_{s_j-k}, v_{s_j-k+1}, \dots, v_n.$$
Then, for / G ^tv, 

|E [/ {^go - f i^go {E\bc) + 5t^) I II = V] I 

< drv {^{^go (S(v,l2) \B^)),^{^go (S(v,l2) Isg) +5*,)) 

= rfi„(^iy(v,i2),^(iy(v,i2) + i)), 

and this, together with f l4.13p . yields that 

|E [/ i^go {E\s,)) - f i^go {E\s,) + 5,,) I E\s^] \ < O (n-Vs^-C^^-^)/^) . 

Case 2. $i \notin (s_{j-1}, s_j]$. The proof is omitted since it is essentially the same as that of Case 1,
with only minor changes of notation.

The proof of (14. 12p is similar. Now, combining (I4.9ti4.12| yields ra;(S) and fa;(S) are 
both bounded by O (|A|"^ (n-'^/3p-{k+i)/3^y These, together with some crude estimates, e.g. 
ES^(^x) < {2k-2)p,EE,{A,)E,{BMx) < {2k-2Yp etc., imply that Eei,,(S,), Eei,,(S) 
and b^e2^xi^x) are all bounded by O (|A|~^ (^ j^- 1/2^9- ('=+i)/3j^ p. Therefore, if |A| > 1, the proof 
is completed by substituting these estimates for the corresponding terms in Theorem 13.21 

Finally, if |A| < 1, we take In = O (p~^), ui, . . . ,ui^ = O (np). Then the right hand side 
of (14. 9 p and (I4.10p can be replaced with 0, and the upper bounds for (14. lip and (I4.12p 
become 0(1), which in turn imply that r^i'^) and f^iE) are both bounded by 0(|A|~^). 
Consequently, lEei^xC^x) , ^^i,x{'^) and b]Eje2,x{'^x) are all bounded by O {\X\~^) p. We then 
employ Theorem 13.21 to obtain the bound p, as claimed. □ 



5 Proof of Theorem 2.4

The proof of Theorem 2.4 relies on coupling and analysis techniques. The main obstacle
in coupling various birth-death systems together is the difficulty of identifying the individual
particles from their locations. To circumvent the repeats of points, we lift the space to
a higher-dimensional carrier space and tackle the problem in the lifted space. This technique
has proved very effective in handling this type of situation [Chen and Xia (2004) and
Xia (2005)].

5.1 Lifting the carrier space



In this subsection, we define F = F x [0, 1] and the pseudometric do on F as 

Let ^ be the class of all finite integer-valued measures on F and di be the induced pseu- 
dometric from do in the same way as di from cIq. For ^ G = '^(xi.ti), we define 
^|r = ^"=1 and extend a function / G to a function on by 

hi) = /(llr). 

It is not hard to check that for each / G / is a di-Lipschitz function: — f{C,2)\ < 

di{Li2)ioT all {1,6 

Next, we define /i as the product measure of /i and Lebesgue measure on [0, 1]. Regardless 
of whether is diffuse, the measure /i is always diffuse on F. Let 

JhiO = {a + b\i\) J_{h{i+S,)-h{0)jjidx) 

+(1 + m - 1)) [ M - 4) - kmids:). 



Birth-death systems on F with the generator evolve in the same way as birth-death 
systems on F with the generator 

To carry out the proof of Theorem 12.41 for a given birth-death system Z^(-) with ^ = 
SiLi ^a^i' '^^^ ^^^^ ^^(■) by setting a ^ G consisting of distinct particles at (xj, tt), 
1 < i < n, where ti, . . . , t„ are distinct elements of [0, 1], and throwing each new born 
particle at z equally likely onto {z} x [0, 1], independently of the others. Then, 

/(Z^-(t)) = /(%(t)|r) = /(Z^(t)), Vt>0. 

This procedure enables us to assume from now on that, without loss of generality, /i is diffuse 
and the particles at ^, rj, x and y are all distinct. 



5.2 Proof of (2.5)

First of all, the proportion of the surviving initial particles at time t can be estimated as 

To this end, we define g{Q := |?7nC|/|C| for the fixed rj G M', where 0/0 is interpreted as 0. 
Recall that V{Q has the uniform distribution on the sites in and we have ^g{C — ^y(c)) — 
g{C)- Hence 

j^g(C) = {a + b\C\){]Eg{C + 5u)-g{0) 






since the last term in (2.2) vanishes. Noticing that, with probability 1, $U \notin \eta$, we have

g(C + M-9(C) = hnc|(^-^)=-^^|^ a.. 

It follows that 

^g{0 < min 1-^1^, -(a A b)g{o\ . 



aAb)ip{t)\. (5.2) 



. 2|C|2 

Therefore, setting (p(t) = lE5f(Z,,(t)), we have 

^\t) = E^giZ,it)) < min {-^E^j^^, 

By the Cauchy inequality, 
where the second inequality holds since each particle dies with rate at least 1. Therefore, 



|z,(t)|2 - \^\ V |z,(t)| 



This, together with (5.2), yields

ip\t) < min {-^'/'W'. -(« /\ ^)V'W} • 

Therefore, (5.1) follows from the fact that $\varphi(0) = 1$.

Next, suppose t] G |?7| = n and the particles at x,y and r/ are all distinct. We start 
with Z^+5^(-) and construct Z^_|_5^(-) by replacing x with Let Tz = mf{t : z ^ Z^+5^(t)} 
for z e ?7 + (5a;. Then, Z^+5^(t) = Z^+5^(t) for t > r^;. For t < Tr,, 

\f{Z,^sAt)) - fi^v+sM\ < t/i(Z,+,,(t),Z,+,^(t)) < ^ 

Therefore, 

\h{v + 5.) - h{v + 5y)\ < lE-^^^dt. (5.3) 
Notice that for all z & r] + S^, 



which implies that 



]g l{r.>t} ^ 1 jg E^g,,+^, l{r.>t} ^ 1 jg l(^ + ^x) nZ^+6^(t)| 

|Z,+5,(t)| n + 1 |Z,+5,(t)| n + 1 |Z,+5,(t)| 
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This, together with fl5.3p and (15. ip . imphes that 



^ + lio l + 2(;^(e*-l) n+1 2 



^1/1 2\ _ 1 1 
where the result also includes the case a = 2{n + 1) , and 

POO 1 1 



On the other hand, 



n + 1 {a Ab){n + 1)' 



E ,i^"^^;\, < P(r,. > t) < e"*, 

\^ri+SA't)\ 



hence < 1. □ 



5.3 Proof of (2.6)

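Suppose $|\xi| = n$ and the particles at $\xi$, $\eta$, $x$ and $y$ are all distinct. Recall that $\mathcal{A}h(\xi + \delta_x) =
f(\xi + \delta_x) - \pi(f)$, i.e.

$$\alpha_{n+1}\mathbb{E}h(\xi + \delta_x + \delta_U) + \beta_{n+1}\mathbb{E}h(\xi + \delta_x - \delta_{V(\xi+\delta_x)}) - (\alpha_{n+1} + \beta_{n+1})h(\xi + \delta_x) = f(\xi + \delta_x) - \pi(f). \quad (5.4)$$

It follows that

$$\mathbb{E}h(\xi + \delta_x + \delta_U) = \frac{f(\xi + \delta_x) - \pi(f)}{\alpha_{n+1}} + \frac{\alpha_{n+1} + \beta_{n+1}}{\alpha_{n+1}}h(\xi + \delta_x) - \frac{\beta_{n+1}}{\alpha_{n+1}}\mathbb{E}h(\xi + \delta_x - \delta_{V(\xi+\delta_x)}).$$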

Hence 

A2Me;a:,|/) = h{^ + 6., + 6y)-]Eh{^ + 6^ + 6u) + h{^ + 6.,)-h{^ + 6y) 
+E/i(e + 5. + 5c/) - 2/i(e + 4) + /^(O 
= hi^ + 4 + 5y) - E/i(e + 4 + 5i/) + hi^ + 4) - hi^ + 5,) 
, /(e + 5.) - 7r(/) 



/j.+i ^ + 5. - - + 5x)) . (5.5) 

Swapping x and y, we get 

A2/i(e;2/,a;) = + 6y + 6^) - Eh{^ + 6y + 6u) + + 6y) - + 6^) 
, /(e + 5,)-7r(/) 



ttn+l 



Ii{h{0-h{^ + 6y-6vi^+Sy))) 



^"+^ E + 5, - 5y(^+,^)) - Me + 5,)) . (5.6) 
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Since A2h{C,] x, y) = A2h{C,] y, x) and \ f — 7r(/)| < 1, we take the average of (15.51) and (15.61) 
to reach the bound 



\A2h{^;x,y)\ < + + + 



+1 



n+1 



A. 



where 



A„ = sup{\h{r] + 4) - h{r]) \ ■.\r]\=n,xe T}. 
On the other hand, we use (5.4) again to obtain



/(e + 5,) - 7r(/) , a„+i + /3„+ 



and argue in the same way as for (5.7) to get



-Me + 5.)-^E/i(e + 4 + M, 



\A2h{^;X,y)\ <- + Cn-1 + Cn+1 

Pn+1 

In Subsection 5.4 below, we will prove that



Pn+l — «ri+l 



+1 



An+l- 



Q-n+l—lin 



7^A„ < ^ + C„, if > 
^"+^~""+' A„+i < + C„, otherwise. 



3n + l 



and so it follows from (5.7) and (5.8) that



\^2K0\ < Cn-l + C„ + C„+i + 2 ( ^ A 

Pn+1 



(5.7) 



(5.8) 



(5.9) 



(5.10) 



For $n = 0$, (2.5) enables us to conclude that $C_0 \le 1$, $C_{-1} = 0$, and it follows from (5.10)
that

|A2MOI<2 + -<^ + -. 

a n + 1 a 

For n > 1, using the estimate Ck < 2{k+i) \ P-^P - the fact 2?7, > + 1, and the bound 
given in (15.100 . we have 



|A2MOI<7^ 



2(A:+1) 
1 



1 



5 2 5 
- < 



2n 2(n + l) 2(n + 2) a~n + l a 
This completes the proof of (2.6).



□ 



5.4 Proof of (5.9)

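Since $\{|Z_\eta(t)|, t \ge 0\}$ is a birth-death process with birth rates $\{\alpha_k\}$, death rates $\{\beta_k\}$ and
initial value $|\eta|$, we follow the convention in Brown and Xia (2001) to define $\tau_{|\eta|,k} = \inf\{t :
|Z_\eta(t)| = k\}$, $\tau_m^+ = \tau_{m,m+1}$ and $\tau_m^- = \tau_{m,m-1}$.

For any $\eta \in \mathcal{H}$ with $|\eta| = n$, by the strong Markov property of $\{Z_\eta(t), t \ge 0\}$,

$$h(\eta) = -\mathbb{E}\int_0^{\tau_n^+} (f(Z_\eta(t)) - \pi(f))\,dt + \mathbb{E}h(Z_\eta(\tau_n^+)),$$

which implies that

$$|h(\eta) - \mathbb{E}h(Z_\eta(\tau_n^+))| \le \mathbb{E}\tau_n^+. \quad (5.11)$$

Now we compare $\mathbb{E}h(Z_\eta(\tau_n^+))$ with $h(\eta + \delta_x)$. Let $K_n^+$ be the number of particles in $\eta$ that
have died before $\tau_n^+$. Clearly, $0 \le K_n^+ \le n$. Given $K_n^+ = k$, there are at most $k + 1$ pairs of
mismatched points between $Z_\eta(\tau_n^+)$ and $\eta + \delta_x$, consequently,

$$|\mathbb{E}[h(Z_\eta(\tau_n^+)) \mid K_n^+ = k] - h(\eta + \delta_x)| \le C_n(k + 1).$$

This in turn leads to

$$|\mathbb{E}h(Z_\eta(\tau_n^+)) - h(\eta + \delta_x)| \le C_n(\mathbb{E}K_n^+ + 1). \quad (5.12)$$

Combining (5.11) and (5.12) gives

$$\Delta_n \le \mathbb{E}\tau_n^+ + C_n(\mathbb{E}K_n^+ + 1).$$

Likewise, for $\eta \in \mathcal{H}$ with $|\eta| = n + 1$, it follows from the strong Markov property of
$\{Z_{\eta+\delta_x}(t), t \ge 0\}$ that

$$h(\eta + \delta_x) = -\mathbb{E}\int_0^{\tau_{n+2}^-} (f(Z_{\eta+\delta_x}(t)) - \pi(f))\,dt + \mathbb{E}h(Z_{\eta+\delta_x}(\tau_{n+2}^-)),$$

giving

$$|h(\eta + \delta_x) - \mathbb{E}h(Z_{\eta+\delta_x}(\tau_{n+2}^-))| \le \mathbb{E}\tau_{n+2}^-. \quad (5.13)$$

Let $K_{n+2}^-$ be the number of particles in $\eta + \delta_x$ that have died before $\tau_{n+2}^-$, then there are at
most $K_{n+2}^-$ mismatched pairs of points between $Z_{\eta+\delta_x}(\tau_{n+2}^-)$ and $\eta$, leading to the bound

$$|\mathbb{E}h(Z_{\eta+\delta_x}(\tau_{n+2}^-)) - h(\eta)| \le C_n\mathbb{E}K_{n+2}^-. \quad (5.14)$$

Collecting the estimates (5.13) and (5.14), we obtain

$$\Delta_{n+1} \le \mathbb{E}\tau_{n+2}^- + C_n\mathbb{E}K_{n+2}^-.$$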

Put F{k) = XliLo^i -^(^) — Y^'iLk'^^i- Lemma 2.2 and Lemma 2.4. in Brown & 
Xia (2001), 

^ + ^ik) ^ Fik) 

Er+ = Er- = -i-^. 



since — ctfc-i ^ f^k — Pk-i for all /c. It follows from the first inequality of f l5.15p that 

Pn+i)F{n) ^ /3„+iF(n + 1) - /5,+iF(n) 
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which in turn yields 



F(n) 1 
Er+ = < , if f3n+i < a„+i. 



Likewise, using the second inequahty of flS.lSp . we get 

F{n + 2) 1 



Pn+2'^n+2 Pn+1 — ^n+l 



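To complete the proof of (5.9), it remains to show

$$\frac{\alpha_{n+1} - \beta_{n+1}}{\alpha_{n+1}}(\mathbb{E}K_n^+ + 1) \le 1, \quad \text{if } \beta_{n+1} < \alpha_{n+1}, \quad (5.16)$$

$$\frac{\beta_{n+1} - \alpha_{n+1}}{\beta_{n+1}}\mathbb{E}K_{n+2}^- \le 1, \quad \text{if } \beta_{n+1} > \alpha_{n+1}. \quad (5.17)$$

To this end, we derive recursive formulae for $\mathbb{E}K_m^+$ and $\mathbb{E}K_m^-$, $m \ge 1$, in Lemma 5.1 below
and give their estimates in Lemma 5.2. In particular, since $\alpha_k - \beta_k$ decreases in $k$
and $\alpha_k$ increases in $k$, it follows from Lemma 5.2 that, if $\alpha_{n+1} > \beta_{n+1}$,

$$1 + \mathbb{E}K_n^+ \le \frac{\alpha_n}{\alpha_n - \beta_n} \le \frac{\alpha_{n+1}}{\alpha_{n+1} - \beta_{n+1}},$$

which is equivalent to (5.16). On the other hand, noting that

$$\beta_{n+2}/(\beta_{n+2} - \alpha_{n+2}) \le \beta_{n+1}/(\beta_{n+1} - \alpha_{n+1})$$

as $\beta_{n+1} - \alpha_{n+1} > 0$, applying Lemma 5.2 again, we obtain $\mathbb{E}K_{n+2}^- \le \beta_{n+1}/(\beta_{n+1} - \alpha_{n+1})$ and
hence (5.17) follows. □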

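Lemma 5.1 The following recursive formulae hold for $m \ge 1$:

$$\mathbb{E}K_m^+ = \frac{m\beta_m(1 + \mathbb{E}K_{m-1}^+)}{m\alpha_m + \beta_m(1 + \mathbb{E}K_{m-1}^+)}, \qquad \mathbb{E}K_m^- = 1 + \frac{(m - 1)\alpha_m\mathbb{E}K_{m+1}^-}{\alpha_m\mathbb{E}K_{m+1}^- + (m + 1)\beta_m}.$$

Proof. Noting that all particles die equally likely, an initial particle in the initial configuration
$\eta$ with $|\eta| = m$ dies before $\tau_m^+$ with probability $\frac{1}{m}\mathbb{E}K_m^+$, and if it survives, it dies before $\tau_{m,m+2}$
with probability $\frac{1}{m+1}\mathbb{E}K_{m+1}^+$. That is, the probability that an initial particle dies before
$\tau_{m,m+2}$ is

$$\frac{1}{m}\mathbb{E}K_m^+ + \left(1 - \frac{1}{m}\mathbb{E}K_m^+\right)\frac{1}{m+1}\mathbb{E}K_{m+1}^+.$$

Therefore, there are on average $\mathbb{E}K_m^+ + \frac{m - \mathbb{E}K_m^+}{m+1}\mathbb{E}K_{m+1}^+$ initial particles that die before $\tau_{m,m+2}$.

On the other hand, $K_m^+ = 0$ means that the first change of the configuration of the birth-death
system $Z_\eta(\cdot)$ is a birth, so $K_m^+ = 0$ with probability $\frac{\alpha_m}{\alpha_m + \beta_m}$. However, if the first
change is a death, which happens with probability $\frac{\beta_m}{\alpha_m + \beta_m}$, then one particle at some site
$x$ of $\eta$ will die at $\tau_\eta = \inf\{t : Z_\eta(t) \ne \eta\}$. In the latter case, using the conclusion in the
preceding paragraph, the mean number of particles in $\eta - \delta_x$ dying before the birth-death
system $Z_{\eta-\delta_x}(\cdot)$ reaches the size $m + 1$ is $\mathbb{E}K_{m-1}^+ + \frac{m - 1 - \mathbb{E}K_{m-1}^+}{m}\mathbb{E}K_m^+$. In summary, we have
established the relationship

$$\mathbb{E}K_m^+ = \frac{\beta_m}{\alpha_m + \beta_m}\left(1 + \mathbb{E}K_{m-1}^+ + \frac{m - 1 - \mathbb{E}K_{m-1}^+}{m}\mathbb{E}K_m^+\right),$$

which is equivalent to the first recursive formula.

The same argument can be adapted to prove the second recursive formula. In fact, assume
$|\eta| = k \ge 2$; an initial particle in $\eta$ dies before $\tau_{k,k-2}$ with probability

$$\frac{1}{k}\mathbb{E}K_k^- + \left(1 - \frac{1}{k}\mathbb{E}K_k^-\right)\frac{1}{k-1}\mathbb{E}K_{k-1}^-.$$

Now, let $|\eta| = m$. With probability $\frac{\beta_m}{\alpha_m + \beta_m}$, the first change of $Z_\eta(\cdot)$ is a death, giving $K_m^- = 1$.
Assume next that the first change is a birth, then, as shown above, each initial particle dies
before the size reaches $m - 1$ with probability $\frac{1}{m+1}\mathbb{E}K_{m+1}^- + \left(1 - \frac{1}{m+1}\mathbb{E}K_{m+1}^-\right)\frac{1}{m}\mathbb{E}K_m^-$. It
then follows that

$$\mathbb{E}K_m^- = \frac{\beta_m}{\alpha_m + \beta_m} + \frac{\alpha_m}{\alpha_m + \beta_m}\left(\frac{m}{m+1}\mathbb{E}K_{m+1}^- + \left(1 - \frac{1}{m+1}\mathbb{E}K_{m+1}^-\right)\mathbb{E}K_m^-\right),$$

and reorganizing the equation yields the second recursive formula. □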
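Lemma 5.2 If $\alpha_m > \beta_m$, then

$$1 + \mathbb{E}K_m^+ \le \frac{\alpha_m}{\alpha_m - \beta_m}.$$

If $\beta_m > \alpha_m$, then

$$\mathbb{E}K_m^- \le \frac{\beta_m}{\beta_m - \alpha_m}.$$

Proof. Suppose $\alpha_m > \beta_m$. By Lemma 5.1 and the fact that $\mathbb{E}K_{m-1}^+ \ge 0$, we have

$$1 + \mathbb{E}K_m^+ \le 1 + \frac{\beta_m}{\alpha_m}(1 + \mathbb{E}K_{m-1}^+), \quad \forall m \ge 1. \quad (5.18)$$

Iterating (5.18) and noticing that $\beta_k/\alpha_k$ is increasing in $k$ as well as $\mathbb{E}K_0^+ = 0$, we conclude
that

$$1 + \mathbb{E}K_m^+ \le \sum_{j=0}^{m-1}\left(\frac{\beta_m}{\alpha_m}\right)^j + \left(\frac{\beta_m}{\alpha_m}\right)^m \le \frac{\alpha_m}{\alpha_m - \beta_m}.$$

Assume $\alpha_m < \beta_m$. Using Lemma 5.1 again together with the fact that $\mathbb{E}K_{m+1}^- \ge 1$, we
have

$$\mathbb{E}K_m^- \le 1 + \frac{\alpha_m}{\beta_m}\mathbb{E}K_{m+1}^-. \quad (5.19)$$

Noticing that $\alpha_k/\beta_k$ is decreasing in $k$, we conclude that

$$\mathbb{E}K_m^- \le \sum_{j=0}^{l-1}\left(\frac{\alpha_m}{\beta_m}\right)^j + \left(\frac{\alpha_m}{\beta_m}\right)^l\mathbb{E}K_{m+l}^-$$

by iterating (5.19). Recalling $\mathbb{E}K_{m+l}^- \le m + l$, we have, by letting $l \to \infty$, that

$$\mathbb{E}K_m^- \le \frac{\beta_m}{\beta_m - \alpha_m}.$$

□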
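
As a numerical sanity check of the first bound in Lemma 5.2, one can iterate the first recursion of Lemma 5.1, read here as $\mathbb{E}K_m^+ = m\beta_m(1 + \mathbb{E}K_{m-1}^+)/(m\alpha_m + \beta_m(1 + \mathbb{E}K_{m-1}^+))$ with $\mathbb{E}K_0^+ = 0$. The rates $\alpha_k = a + bk$ and $\beta_k = k$, the parameter values and the function name below are assumptions chosen only for illustration.

```python
def mean_deaths_before_up_crossing(m_max, alpha, beta):
    """Iterate E K_m^+ = m*beta_m*(1 + E K_{m-1}^+) / (m*alpha_m + beta_m*(1 + E K_{m-1}^+))."""
    values = [0.0]                       # E K_0^+ = 0: an empty configuration has no particle to die
    for m in range(1, m_max + 1):
        x = 1.0 + values[-1]
        values.append(m * beta(m) * x / (m * alpha(m) + beta(m) * x))
    return values

a, b = 5.0, 0.5                          # illustrative parameters (assumptions)
alpha = lambda k: a + b * k              # assumed birth rates
beta = lambda k: float(k)                # assumed death rates (unit per-capita rate)

ek_plus = mean_deaths_before_up_crossing(9, alpha, beta)
for m in range(1, 10):                   # alpha_m > beta_m holds for m <= 9 with these rates
    bound = alpha(m) / (alpha(m) - beta(m))
    print(m, round(1 + ek_plus[m], 4), round(bound, 4))
```

With these rates $\beta_k/\alpha_k$ is increasing in $k$, which is the monotonicity used when iterating (5.18).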



6 Proof of Theorem 3.2



Let X be a point process with distribution TTa,b;Q:u^ then by the triangle inequality, we have 

It follows from (13 .Sp that both (i2(ifS, if (^goH)) and d2{^{^goX),7rafi-o;u) are bounded 
by dQ{Q), so it remains to estimate (i2(if (^goS), if (^goX)). Clearly, ^goX ~ '^a,b;oy, 
where 

k 

v\dx) = Y,^iGi)5tXdx). 

i=l 

Using the Stein equation (12. 3p with tt = T^a,bfl;v'^ it suffices to show that for each / G ^ , 

\E£/hf{^goE)\ = |E/(^goH) - 7ra,b;0y{f)\ 

< / E [(1 + 6)(ei,,(H,) + ei,,(H)) + bf,{E)E,{A,) + 6e2,,(H,)] A(rf?/). (6.1) 

To simplify the notation, we fix / G write /'(?]) = f{-y^g°Tj), h'irj) = hf{^gor]) and 
define 

Ah'{^;x) = h'{^ + 5^)-h'{0- 

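Noting that $h'$ acts on the 'shuffled' configurations so one can swap $\nu'$ for $\nu$ in $\mathcal{A}h'$, we apply
(3.1) to expand $\mathbb{E}\mathcal{A}h'(\Xi)$ as

$$\mathbb{E}\mathcal{A}h'(\Xi) = b\int_{\Gamma^2} \mathbb{E}[\Delta h'(\Xi_y + \delta_y; x) - \Delta h'(\Xi; x)]\,\lambda(dy)\nu(dx)$$
$$+ \int_\Gamma \mathbb{E}[-\Delta h'(\Xi_x; x) + \Delta h'(\Xi; x)]\,\lambda(dx)$$
$$+ \int_\Gamma \mathbb{E}\Delta h'(\Xi; x)[a\nu(dx) + b|\lambda|\nu(dx) - \lambda(dx)]. \quad (6.2)$$

The last term vanishes since $(a + b|\lambda|)\nu = \lambda$, which is ensured by the facts that $|\nu| = 1$ and
$a = (1 - b)|\lambda|$.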
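
The role of the relation $a = (1 - b)|\lambda|$ can be illustrated on the total count alone. Assuming, for this sketch only, that the total count of the approximating process evolves as a birth-death chain with birth rates $a + bk$ and death rates $k$, its stationary law is negative binomial with mean $a/(1 - b)$ and variance $a/(1 - b)^2$, so choosing $b = (\mathrm{Var} - \mathrm{mean})/\mathrm{Var}$ and $a = (1 - b)\,\mathrm{mean}$ matches the first two moments. The target moments, the truncation level and the function name below are illustrative assumptions.

```python
def stationary_distribution(a, b, n_max):
    """Detailed balance pi_k * (a + b*k) = pi_{k+1} * (k + 1), truncated at n_max."""
    weights = [1.0]
    for k in range(n_max):
        weights.append(weights[-1] * (a + b * k) / (k + 1))
    total = sum(weights)
    return [w / total for w in weights]

mean_target, var_target = 4.0, 10.0      # illustrative moments with variance > mean
b = (var_target - mean_target) / var_target
a = (1 - b) * mean_target
pi = stationary_distribution(a, b, 2000)
mean = sum(k * w for k, w in enumerate(pi))
var = sum(k * k * w for k, w in enumerate(pi)) - mean**2
print("a:", a, "b:", b, "mean:", mean, "variance:", var)
```

The truncation at 2000 states is harmless here because the stationary weights decay geometrically at rate $b < 1$.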
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To study the first term in (16.21) . we take a coupling (Gj^, Ty, Uy) of H|^c (notice tliat it lias 
the same distribution as that of S^^l^c), Sj^^, and '^y\Ay, such that ^{Qy + T^,) = and 
^{Qy + Uy) = ^{Ey). Dropping the subscript y from (6^, Ty, Uy), we can write 

lE{Ah'{Ey + 6y; x) - Ah'{E; x)} 
= ]E{Ah'{Q + U + 6y;x) - Ah'{Q + T;x)} 

= E{[Ah'{e + U + 6y,x)- Ah'{Q; x)] + [Ah'{Q; x) - Ah'{Q + T; x)]} . 

When expanded telescopically, it is the sum of $|\Pi| + 1$ positive $\Delta_2h'$-functions for the term in
the first pair of square brackets, and $|T|$ negative $\Delta_2h'$-functions for the term in the second
pair of square brackets. Similarly, the second term in (6.2) can be expressed as the sum of
$|T|$ positive $\Delta_2h'$-functions and $|\Pi|$ negative $\Delta_2h'$-functions. Therefore, when

$$b\int_\Gamma \mathbb{E}(\Xi_y(A_y) + 1 - \Xi(A_y))\,\lambda(dy) + \int_\Gamma \mathbb{E}(\Xi(A_y) - \Xi_y(A_y))\,\lambda(dy) = 0, \quad (6.3)$$

the expected numbers of positive and negative $\Delta_2h'$-functions are then balanced. Noting that

$$\int_\Gamma \mathbb{E}(\Xi_y(A_y) - \Xi(A_y))\,\lambda(dy) = \mathrm{Var}(|\Xi|) - \mathbb{E}|\Xi|, \quad (6.4)$$

we obtain (6.3) by taking $b = \frac{\mathrm{Var}(|\Xi|) - \mathbb{E}|\Xi|}{\mathrm{Var}(|\Xi|)}$. Now, we denote $\Pi = \sum_{j=1}^{|\Pi|} \delta_{x_j}$, $T = \sum_{j=1}^{|T|} \delta_{y_j}$;
for $\eta = \sum_{i=1}^{l} \delta_{z_i}$, write $(\eta)_0 = 0$, $(\eta)_j = \sum_{i=1}^{j} \delta_{z_i}$, $1 \le j \le l$. Taking $\tilde\Xi$ as an independent
copy of $\Xi$, we can expand $\mathbb{E}\mathcal{A}h'(\Xi)$ into

$$\mathbb{E}\mathcal{A}h'(\Xi) = e_1 + \cdots + e_5,$$

where

rr 

ei = bll Ej2[^2h'ie + {U)j^i + Sy;x,Xj) -'EA2h'{E;z,z)]\idy)u{dx), 



r2 



62 = b W.[A2h'{e;x,y)-'EA2h'{E;z,z)]X{dy)u{dx), 



r2 



|T| 

63 = -b[[ Ej2l^2h'iQ + {T)j^,;x,y,)-lEA2h'{E;z,z)]\{dy)u{dx) 



r 

64 = - / E V[A2/i'(e + (n)j_i; X, Xj) - EA2/i'(H; z, z)]X{dx) 
es = / EV[A2/i'(e+(T),_i;x,y,)-EA2/i'(S;^,2)]A(rfx). 



Now we concentrate on estimating ei, since others are similar. Recalling that S^l^g is not 
independent of Ey\Ay while Ey\B^ is, we can extract the part as Ey\B^ from ~ ^(Hyl^g), 
and denote it by 61. Take a more detailed coupling (9i,92,T,n) such that (0i,02) is 
a coupling of E\b^ and E\By\Ay (as well as Ey\B^ and Ey\By\Ay)^ and 0i is dependent of 
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(T,n). We then take (02, T) as a copy of (02, T) such that (02, T) is independent of 11 
and =Sf(0i + ©2 + T) = We insert A2/i'(0i; x, Xj) and A2h'{Qi] z, z) into the square 
brackets in ei to obtain 




(en H h ei5)\{dy)u{dx), 



where 

|n| 

en = E^[A2/i'(0i + 02 + (n),-i + 5y;x,Xj) - A2/i'(0i + (n),_i + (5y;x,Xj)], 
i=i 
|n| 

ei2 = E^[A2/i'(0i + (n)j„i + - A2/i'(0i + 

i=i 
|n| 

ei3 = E y^[A2/i\0i + Sy] X, Xj) - A2h'{Qi; x, Xj)], 
i=i 
|n| 

ei4 = E^[A2/i'(0i;x,Xj) - A2/i'(0i;z,2;)], 
i=i 

ei5 = E|n|E[A2/i'(0i;z,z) - A2/i'(0i + 02 + T;2,z)]. 

Estimates of en and eis . Notice en can be further decomposed as 
|n| leal 

E J] J] [A^h'iQi + 6y + (02, n)i,,_i; X, Xj) - A2/i'(0i + 6y + (02, n),_ij_i; x, x,)] , 

where (02,n)ij = (02)i + (n)j are measurable to (02,11). When we take the expectation 
conditional on "^ylsy, or equivalently on (02,11), it can be interchanged with the sums. 
Therefore, we concentrate on the conditional expectation 

E ( A2/i'(0i + Sy + (02, X, Xj) - A2/i'(0i + Sy + (02, n),_i,,„i; x, x,) | 02, H) . (6.5) 

Since by (12. 6p . there is no uniform bound for A2h', we write 

A2/i' = /i« + /i('), 

where 

/.(I) = min jmax (^A^h', , , h^'^ = A^h' - h^'\ 

Since 

9ll -1-5 (7 

\A2h'i^;x,y)\<^—ioTl + \^\>-, 
a u 

we have 

9')/ _L (7 

< T:!^ |/i(2)|<2, and/i(2)(^;a;,y) = 0for l + |e|>-. (6.6) 
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For the quantity given in (16. 5p . the differences based on h^^'^ and /i^^^ are respectively bounded 
by the second and the first terms of VyCEy), recalhng that 'Eylsy is equivalent to (62, H). 
Hence, 

|en| < E|n| ■ |e2|r,(S,) = Er,(S,)S,(A,)S,(5, \ Ay). (6.7) 

Similarly, taking conditional expectation on (6)2, , T), we get 

Idsl < lE|n|E(|e2 + t|r,(ei + 62 + t)) = Er,(S)H(fi,)ES,(A,). (6.8) 



Estimates of ei2 and 613. Notice that 62 disappears now and 61 is independent of 11. We 
use the conditional expectation on 11, and find each conditional expectation, actually being 



the mean, is less than ry{Ey) = ry{E). Hence 



leial < f,(H)E|n| = f,(S)EH„(^„). (6.10) 



Estimate of en. In fact, ei4 is another kind of difference that is very different from the 
other four since the two point processes have the same size. Let us state a result which tells 
us the cost of shuffling points x and y in A2h{C,; x, y). Define 

Dh'i^- X, y) = h\i + 5^) - h\i + 5y), D2h\i- x, y- z) = Dh\i + 5,; x, y) - Dh'{^- x, y). 

Then, one can directly verify the following equation: 

A2/i'(e; X, y) - A2/i'(e; z) = D^h'i^- y, z- x) + D^hO^t x, z- z). (6.11) 

Consequently, we can rewrite 

|n| 

ei4 = E ^[£)2/i'(0i; a^i, z\ x) + D2h'{Qi; x, z; z)], 
i=i 

bearing in mind H = Y^^i ^xy Now we estimate D2h' . Recalling \Dh'\ < Cn defined in (12.41) 
and estimated in (12. 5p . we have 

\Dh'{^;x,y)\ < 1 A 



2(lel + l) aj- 



If we set 



where 



= max jmin (^Dh', , -^^^} and /i^^) = Dh' - h^^\ 



then 

|;^(3)| < !i±^^ \h(^)\<l andh^^\C,x,y) = OioTl + \^\>-. (6.12) 
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Comparing with (16. 6p . we conclude that A2/i', as the difference of A/i', has conditional 
expectation (that reduces to its expectation) less than a half of ^^(S). Therefore, 

leul < fyiE)^^ = fy{E)EEy{Ay). (6.13) 
Collecting fl6.7ll6.inp and f l613l) . we obtain 

|ei| < b j^[]Ery{Ey)Ey{Ay)Ey{By\Ay) 

+fyiEMEyiAy) + 3)H,(A,)/2 + Er , ( S ) S ( 5, ) ES, ( A, ) ] A ( rft/ ) . (6.14) 

The same procedure can be applied to estimate 62 to 65 by first selecting the 'stepping 
stones' Eylsc and Hl^g to 'bridge' S^l^g and S ~ for 62 and 64, and Sj^g and S|bc to 
'bridge' 5|^c and S ~ =SfH in 63 and 65, then telescoping within the layer of dependence and 
using (16. lip and (I6.12p to deal with relocation of points. We omit the details here and the 
estimates are summarized below: 

|e2| < b j^[]EryiEy)Ey{By\Ay) + fyiE) + lEryiE)EiBy)]X{dy), 



lesi < bJ^[I<:ryiE)E{Ay)E{By\Ay)+fyiE)n^iAy) + l)EiAy)/2 

+ EryiE)EiBy)EEiAy)]\{dy), 
|e4| < J [Er,.(S,.)S,(A,)S,(5, \ A,) + f,.(S)E(S,(A,) + l)H,(A,)/2 

+ Er,{E)E{B^)]EE,{A,)]\{dx), 
lesI < J [Er,.(S)S(A,.)S(S,. \ A.,) + f,.(S)E(S(A,.) + l)S(A,)/2 

+ Er,{E)E{B,)lEE{A,)]\{dx). 

Now, the above four estimates, together with (6.14), yield (6.1), completing the proof of
Theorem 3.2. □



7 Proof of Theorem 3.3

The proof is similar to that of Theorem 3.2, with some modification to suit the estimation
involving the second order reduced Palm processes. Let $Y$ be a point process with distribution
$\pi_{a,0;\beta;\nu}$; it follows from the triangle inequality that

rf2(^S, 7r,,0;/3;.) < ^2(^2, ^(^goS) ) +^2 (^(^goS) , ^(^goF ) ) +^3 (^(^goF ) , 7r,,0;/3;.). 

Again, ( 13. 3p implies that d2{^E, ^{^goE)) and (i2(-^(^g;o5^), tTq^O;^;!') are bounded by 
do{Q), so (i2(-^(-'^go2), ^(^goF)) is the only term to be estimated. 






We replace tt by 'Ka,0;i3y in the Stein equation (12. 3 p with v'{dx) = Yl'i=i ^{Gi)5tXdx)- It 
is sufficient to prove 

E^//i/(^goH) < ^E(ei,,(H,) + ei,,(S))A(dx) 

+/3 jjw. {ei^^^yi^^y) + ei,x,j,(S) + e2,x,3,(H,3,)) \^''\dx, dy) (7.1) 

for all / G For the fixed / G we set f'irf) = /(^gor/), /i'(?7) = hf^^gorf) and then 
apply (13. ip and (13. 2p to deduce the following expansion 

E^//i'(S) = j W:[-Ah'{E^;x) + Ah\E;x)]\{dx) 

+P jj M[-Ah'{E^y + 5y]x) + Ah'{E-x)]\^'^\dx,dy) 

+ y AE/i'(S; x) (^az/(rfx) - A(rfx) - /3 ^ A'^^ (rfx, d?/)^ . (7.2) 

The last term of (7.2) vanishes because of the definition of $\nu$ in (3.5), and $\nu(\Gamma) = 1$ ensures
that

$$a = |\lambda| + \beta\int_{\Gamma^2} \lambda^{[2]}(dx, dy) = |\lambda| + \beta(\mathbb{E}|\Xi|^2 - |\lambda|).$$

We take S as an independent copy of S which is also independent of all H^.'s and H^-j^'s. 
Denote the points in E\a,, '^xIa^, '^\A^y, ^xy\A^y respectively by Xj, yj, Wj, Vj. Then using 
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the two types of local dependence, we have 



^{E[-A/i'(H,;x) + A/i'(S,.Uc;a;)]+E[A/i'(S;x)-A/i'(SUj,;x)]}A(rfx) 
-/3 JJjW.Ah'{E,y;x)-lEAh'{E^yU^^^;x)]X^^\dx,dy) 
-P jj^^ [EA/i'(S,, + 5y- x) - MAh\E,y- x)] A[2] {dx, dy) 
+/3 |^jEA/i'(H;x) -EA/i'(5U.^;x)]APl(rfx,rf|/) 

~ / ^ 5Z [^2/i'(Sx.Ug + (Sx.UJi-i;x,?/j) - A2/i'(H;xi,Xi)]A(rfx) 

+ / E V [A2/i'(HUc + (SUJ,-i;x,x,-) - A2/i'(S;xi,xi)]A(dx) 

-/3 / / E ^ [A2/i'(S^2;U=^ + (S^-yU,,)i-i;a;,t;j) - A2/i'(S;xi,xi)]A[^](rfx,c/y) 
-13 JJjA2h'(E,y,x,y)-A2h'{E-xi,xi)]X^^\dx,dy) 



E ^ [A2h\E\Ag^ + {E\Ajj-i;x,w,)-A2h'{E;xi,xi)]X^'\dx,dy) 



.=1 

-EA2/i'(S;xi,Xi) 
!.i + ■ ■ ■ + (/.g. 



j ]E{E,{A,)-E{A,))\{dx) + f3 jj ^E{E,y{A,y) + 1 - E{A,y))\^'\dx,dy) 

(7.3) 



The term becomes if we set 



^E(S,.(A,) - S(A,.))A(rfx) + /3 E(S,.,(A,.,) + 1 - E{A.,y))X^^\dx,dy) = 0, 



hence the /3 in (13 ■4p follows from (16.41) . J/^a X^'^^{dx,dy) = E|H|(|H| — 1) and the following 
observation 



r2 



E{E.,y{A.,yyE{A,y))X^'\dx,dy) = // E(|S,.,|-|S|)APl(da;,dy) = E(|S|-2-|A|)(|H|-l)| 



r2 



Following the same steps as the estimation of (16.141) . with 'stepping stones' E^Ib^ and E\b^ 
for 01, S|ijc and E\b^ for 02, S^jj^Ibc^ and E\bc^ for 03 and 04, and S|ijc^ and H|bc^ for 05, we 






obtain 

01 < y Eei,^(E:^)A(dx); 

03 < /3 JJ^^'Eei,,,yiE,y)X^^\dx,dyy, 

04 < (3 jj^Ee2,.,y{E,,y)\^^\dx,dy); 

05 < 13 jj^Ee^,,,y{E)\^^\dx,dy), 

which, together with (7.3), in turn imply (7.1). This completes the proof of Theorem 3.3. □
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